One of the problems central to data consistency is data repairing. Given a database
violating a set
of data dependencies as data quality rules, it aims to modify
for a new relation
is a centralized database, a host of methods have been provided to address this problem. In practice, a database may be fragmented and distributed to multiple sites, which is advocated by distributed systems for better scalability and is readily supported by commercial systems. This paper makes a first effort to develop techniques for repairing functional dependency violations in a horizontally partitioned database. (1) Based on a message-passing distributed computing model and two complexity measures (parallel time and data shipment) for distributed algorithms, we study data repairing with equivalence classes in the distributed setting. We show that it is NP-completeto build equivalence classes when the data is horizontally partitioned, and when we aim to minimize either data shipment or parallel computation time. (2) Despite the intractability, we propose efficient distributed algorithms and optimization techniques for data repairing based on equivalence classes. (3) We experimentally verify the effectiveness and efficiency of our algorithms, using both real-life and synthetic data.