Your browser does not support JavaScript!
http://iet.metastore.ingenta.com
1887

Learning context‐free grammar rules from a set of program

Learning context‐free grammar rules from a set of program

For access to this article, please select a purchase option:

Buy article PDF
£12.50
(plus tax if applicable)
Buy Knowledge Pack
10 articles for £75.00
(plus taxes if applicable)

IET members benefit from discounts to all IET publications and free access to E&T Magazine. If you are an IET member, log in to your account and the discounts will automatically be applied.

Learn more about IET membership 

Recommend Title Publication to library

You must fill out fields marked with: *

Librarian details
Name:*
Email:*
Your details
Name:*
Email:*
Department:*
Why are you recommending this title?
Select reason:
 
 
 
 
 
IET Software — Recommend this title to your library

Thank you

Your recommendation has been sent to your librarian.

The grammar of a programming language is important because it is used in developing program analysis and modification tools. Sometimes programs are written in dialects–minor variations of standard languages. Normally, grammars of standard languages are available but grammars of dialects may not be available. A technique for reverse engineering context-free grammar rules is presented. The proposed technique infers rules from a given set of programs and an approximate grammar is generated using an iterative approach with backtracking. The correctness of the approach, is explained and a set of optimisations proposed to make the approach more efficient. The approach and the optimisations are experimentally verified on a set of programming languages.

References

    1. 1)
      • Van Zaanen, M.: `ABL: alignment-based learning', COLING 2000 – Proc. 18th Int. Conf. Computational Linguistics, August 2000, Saarbr̈ucken, Germany, p. 961–967.
    2. 2)
      • Sellink, A., Verhoef, C.: `Development, assessment, and reengineering of language descriptions', Proc. 4th European Conf. Software Maintenance and Reengineering, March 2000, IEEE Computer Society, p. 151–160.
    3. 3)
      • Mernik, M., Gerlic, G., Zumer, V., Bryant, B.: `Can a parser be generated from examples?', Proc. 18th ACM Symp. Applied Computing, 2003, ACM Press, p. 1063–1067.
    4. 4)
      • R. L̈ammel , C. Verhoef . Semi-automatic grammar recovery. Softw. Pract. Exp. , 15 , 1395 - 1438
    5. 5)
      • M. Crepinsek , M. Mernik , F. Javed , B.R. Bryant , A. Sprague . Extracting grammar from programs: evolutionary approach. SIGPLAN Not. , 4 , 39 - 46
    6. 6)
      • A.V. Aho , R. Sethi , J.D. Ullman . (2002) Compilers principles, techniques, and tools.
    7. 7)
      • J.E. Hopcroft , J.D. Ullman . (1979) Introduction to automata theory, languages, and computation.
    8. 8)
      • Adriaans, P.W.: `Language learning for categorial perspective', November 1992, PhD, University of Amsterdam, Amsterdam, Netherlands.
    9. 9)
      • Geertzen, J., Van Zaanen, M.: `Grammatical inference using suffix trees', Proc. Int. Colloq. Grammatical Inference (ICGI), October 2004, Athens, Greece, p. 63–174.
    10. 10)
      • Dubey, A., Jalote, P., Aggarwal, S.K.: `Inferring grammar rules of programming language dialects', ICGI, September 2006, Tokyo, Japan, LNCS, Springer-Verlag, p. 201–213.
    11. 11)
      • Masaru, T.: `Graph-structured stack and natural language parsing', Proc. 26th Annual Meeting on Association for Computational Linguistics, 1988, Morristown, NJ, USA, Association for Computational Linguistics, p. 249–257.
    12. 12)
      • C. Higuera De La . A bibliographical study of grammatical inference. Pattern Recognition , 1332 - 1348
    13. 13)
      • Java1.5: ‘JDKTM 5.0 documentation’, 2004. URL: http://java.sun.com/j2se/1.5.0/docs/index.html.
    14. 14)
    15. 15)
      • R. L̈ammel , C. Verhoef . Cracking the 500-language problem. IEEE Softw. , 6 , 78 - 88
    16. 16)
      • E.M. Gold . Complexity of automaton identification from given data. Inf. Control , 3 , 302 - 320
    17. 17)
      • Lee, L.: `Learning of context-free languages: a survey of the literature', Technical report TR-12-96, 1996, URL: ftp://deas-ftp.harvard.edu/techreports/tr-12-96.ps.gz.
    18. 18)
      • M. Crepinsek , M. Mernik , V. Zumer . Extracting grammar from programs: brute force approach. SIGPLAN Not. , 4 , 29 - 38
    19. 19)
    20. 20)
      • C*: UNH C* – a data parallel dialect of C, 1998. URL: http://www.cs.unh.edu/~pjh/cstar/.
    21. 21)
      • Dubey, A., Aggarwal, S.K., Jalote, P.: `A technique for extracting keyword based rules from a set of programs', CSMR'05: Proc. 9th European Conf. Software Maintenance and Reengineering (CSMR'05), 2005, Manchester, UK, IEEE Computer Society, p. 217–225.
    22. 22)
      • Grammars: ‘Compilers & interpreters’, July 2000. URL: http://www.angelfire.com/ar/CompiladoresUCSE/COMPILERS.html.
    23. 23)
    24. 24)
      • Kasami, T.: `An efficient recognition and syntax analysis algorithm for context free languages', Technical report AFCRL-65758, Air Force Combridge Research Laboratory, 1965, BedFord, MA.
    25. 25)
      • D.H. Younger . Recognition and parsing of context-free languages in time n3. Inf. Control , 2 , 189 - 208
    26. 26)
      • RC: RC – safe, region-based memory-management for C, 2001. URL: http://berkeley.intel-research.net/dgay/rc/index.html.
    27. 27)
      • E.M. Gold . Language identification in the limit. Inf. Control , 5 , 447 - 474
    28. 28)
    29. 29)
      • Javed, F., Bryant, B.R., Crepinsek, M., Mernik, M., Sprague, A.P.: `Context-free grammar induction using genetic programming', Proc. 42nd Annual Southeast Regional Conf., 2004, ACM Press, p. 404–405.
    30. 30)
      • R. Parekh , V. Honovar , R. Dale , H. Moisl , H. Somers . (2000) Grammar inference, automata induction, and language acquisition.
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-sen_20070061
Loading

Related content

content/journals/10.1049/iet-sen_20070061
pub_keyword,iet_inspecKeyword,pub_concept
6
6
Loading
This is a required field
Please enter a valid email address