A model for the unsupervised segmentation and linguistic analysis of Arabic texts of Prophetic tradition (ḥadīṯs), SALAH, is proposed. The model automatically segments each text unit into a transmitter chain (isnād) and a text content (matn), then analyses each segment with its own pipeline. A set of regular expressions chunks the transmitter chain into a graph whose edges are labeled with the relations between transmitters, while a tailored, augmented version of the AraMorph morphological analyzer (RAM) annotates the text content lexically and morphologically. The final output of the system is a graph of relations among transmitters and a lemmatized text corpus, both in XML format; this output can in turn feed the automatic generation of concordances of the texts with variable-sized windows. The results can serve a variety of purposes, including retrieving information from ḥadīṯ texts, verifying the relations between transmitters, finding variant readings, and supplying lexical information to specialized dictionaries.
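As an illustration of the last step only, concordance generation with a variable-sized window over a lemmatized corpus can be sketched as follows. This is a minimal sketch under stated assumptions, not the SALAH implementation: the function name `concordance`, the representation of the corpus as `(form, lemma)` pairs, and the placeholder tokens are all hypothetical, and reading the XML output is left out.

```python
from typing import List, Tuple

# Hypothetical representation: one (surface form, lemma) pair per token.
Token = Tuple[str, str]


def concordance(tokens: List[Token], lemma: str, window: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, matched form, right context) for each token
    whose lemma equals `lemma`, keeping up to `window` forms on each side."""
    lines = []
    for i, (form, tok_lemma) in enumerate(tokens):
        if tok_lemma != lemma:
            continue
        # Clamp the left edge at the start of the text.
        left = [f for f, _ in tokens[max(0, i - window):i]]
        # Slicing past the end of the list is safe in Python.
        right = [f for f, _ in tokens[i + 1:i + 1 + window]]
        lines.append((" ".join(left), form, " ".join(right)))
    return lines


if __name__ == "__main__":
    # Placeholder tokens; a real run would use forms and lemmas from the corpus.
    sample = [("w1", "a"), ("w2", "b"), ("w3", "a"), ("w4", "c")]
    for left, key, right in concordance(sample, "a", window=2):
        print(f"{left:>10} [{key}] {right}")
```

Matching on the lemma rather than on the surface form lets a single query collect all inflected occurrences of a word, which is the benefit of building the concordance on top of the lemmatized corpus; the `window` parameter sets how many neighbouring forms are shown on each side.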