Abstract
The computational complexity of learning Boolean concepts from examples is investigated. It is shown for various classes of concept representations that these cannot be learned feasibly in a distribution-free sense unless R = NP. These classes include (a) disjunctions of two monomials, (b) Boolean threshold functions, and (c) Boolean formulas in which each variable occurs at most once. Relationships between learning of heuristics and finding approximate solutions to NP-hard optimization problems are given.
- 1 ANGLUIN, D. On the complexity of minimum inference of regular sets. Inf. Control 39 (1978), 337-350.Google Scholar
- 2 ANGLUIN, D. Finding patterns common to a set of strings. J. Comput. Syst. Sci. 21 (1980), 46-62.Google Scholar
- 3 ANGLUIN, D.Inductive inference of formal languages from positive data. Inf. Control 45 (1980), 117-135.Google Scholar
- 4 ANGLUIN, n. Inference of reversible languages. J. ACM 29, 3 (July 1982), 741-765. Google Scholar
- 5 ANGLUIN, D. Remarks on the difficulty of finding a minimal disjunctive normal form for Boolean functions. Unpublished manuscript.Google Scholar
- 6 ANGLUIN, n. Learning regular sets from queries and counter-examples. Yale University Tech. Rep. YALEU/DCS/464, 1986.Google Scholar
- 7 ANGLUIN, n., AND SMITH, C. inductive inference: Theory and methods. AC;vl Comput. Surv. 15, 3 (Sept. 1983), 237-269. Google Scholar
- 8 BLUM, L., AND BLUM, M. Toward a mathematical theory of inductive inference. Inf. Control 28 (1975), 125-155.Google Scholar
- 9 BLUMER, A., EHRENFEUCHT, m., HAUSSLER, D., AND WARMUTH, M. Classifying learnable geometric concepts with the Vapnik-Chervonenkis dimension. In Proceedings of the 18th Annual Symposium on the _Theory of Computing (Berkeley, Calif., May 28-30). ACM, New York~ !986, pp. 273-282. Google Scholar
- 10 BLUMER, A., EHRENFEUCHT, A., HAUSSLER, D., AND WARMUTH, M. Occam's Razor. Inf. Process. Lett. 24 (1987), 377-380. Google Scholar
- 11 CASE, J., AND SMITH, C. Comparison of identification criteria for machine inductive inference. Theoret. Comput. Sci. 25 (1983), 193-220.Google Scholar
- 12 CHVATAL, V. A greedy heuristic for the set covering problem. Math. Oper. Res. 4, 3 (1979), 233-235.Google Scholar
- 13 DALEY, R. On the error correcting power of pluralism in BC-type inductive inference. Theoret. Comput. Sci. 24 (1983), 95-104.Google Scholar
- 14 DIETTERICH, T. C., AND MICHALSKI, R.S. A comparative review of selected methods for learning from examples. In Machine Learning. An Artificial Intelligence Approach. Tioga, Palo Alto, Calif., 1983.Google Scholar
- 15 FREIVALD, R.V. Functions computable in the limit by probabilistic machines. In Mathematical Foundations of Computer Science (3rd Symposium at Jadwisin near Warsaw, 1974). Springer- Verlag, New York, 1975. Google Scholar
- 16 FREIVALD, R.V. Finite identification of general recursive functions by probabilistic strategies. In Proceedings of the Conference on Algebraic, Arithmetic, and Categorial Methods in Computation Theory. Akadamie-Verlag, New York, 1979, pp. 138-145.Google Scholar
- 17 GAREY, M., AND JOHNSON, D. The complexity of near-optimal graph coloring. J. ACM 23, 1 (Jan. 1976), 43-49. Google Scholar
- 18 GAREY, M. AND JOHNSON, D. Computers and Intractability: A Guide to the Theory of NP- Completeness. Freeman, San Francisco, 1979. Google Scholar
- 19 GILL, J. Computational complexity of probabilistic Turing machines. SIAM J. Comput. 6 (1977), 675-695.Google Scholar
- 20 GOLD, E.M. Complexity of automaton identification from given data. Inf. Control 37 (1978), 302-320.Google Scholar
- 21 GOLDREICH, O., GOLDWASSER, S., AND MICALI, S. How to construct random functions. J. ACM 33, 3 (July 1986), 792-807. Google Scholar
- 22 HA USSLER, D. Quantifying the inductive bias in concept learning. In Proceedings of AAAI-86 (Philadelphia, Pa.). Morgan Kaufman, Los Altos, Calif., 1986, pp. 485-489.Google Scholar
- 23 HORNING, J.J. A study of grammatical inference. Ph.D. Dissertation. Computer Science Dept., Stanford Univ., Stanford, Calif., 1969. Google Scholar
- 24 JOHNSON, D. S. Worst case behaviour of graph coloring algorithms. In Proceedings of the 5th South-Eastern Conference on Combinatorics, Graph Theory, and Computing. Utilitas Mathematica, Winnipeg, Canada, 1974, pp. 513-528.Google Scholar
- 25 LEVIN, L. Universal sorting problems. Prob. Pered. Inf. 9, 3 (1973), pp. I 15-116.Google Scholar
- 26 MICHALSKI, R. S., CARBONELL, J. G., AND MITCHELL, T. M. Machine Learning." An Artificial Intelligence Approach. Tioga, Palo Alto, Calif., 1983. Google Scholar
- 27 PITT, L. A characterization of probabilistic inference. In Proceedings of the 25th Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, Washington, D.C., 1984, pp. 485-494.Google Scholar
- 28 PODNIEKS, K.M. Probabilistic synthesis of enumerated classes of functions. Soy. Math. Dokl. 16 (1975), I042-1045.Google Scholar
- 29 ROYER, j.S. On machine inductive inference of approximations. Inf. Control, to appear. Google Scholar
- 30 RUDICH, S. Inferring the structure of a Markov chain from its output. In Proceedings of the 26th Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, Washington, D.C., 1985, pp. 321-325.Google Scholar
- 31 SMITH, C. H., AND VELAUTHAPILLAI, M. On the inference of approximate programs. Tech. Rep. 1427, Dept. of Computer Science, Univ. of Maryland, College Park, Md.Google Scholar
- 32 VALIANT, L.G. A theory ofthe learnable. Commun. ACM27, 11 (1984), 1134-1142. Google Scholar
- 33 VALIANT, L. G. Learning disjunctions of conjunctions. In Proceedings of the 9th IJCAI (Los Angeles, Calif., Aug. 1985), vol. 1. Morgan Kaufman, Los Altos, Calif., 1985, pp. 560-566.Google Scholar
- 34 WIEHAGEN, R., FREIVALD, R., AND KINBER, E. B. On the power of probabilistic strategies in inductive inference. Theoret. Comput. Sci. 28 (1984), I I 1-133.Google Scholar
- 35 WIGDERSON, A. A new approximate graph coloring algorithm. In Proceedings of the 14th Annual Symposium on the Theory of Computing. ACM, New York, 1982, pp. 325-329. Google Scholar
Index Terms
- Computational limitations on learning from examples
Recommendations
Complexity theoretic limitations on learning halfspaces
STOC '16: Proceedings of the forty-eighth annual ACM symposium on Theory of ComputingWe study the problem of agnostically learning halfspaces which is defined by a fixed but unknown distribution D on Q^n X {-1,1}. We define Err_H(D) as the least error of a halfspace classifier for D. A learner who can access D has to return a ...
Comments