Khemeia™
Khemeia™ is an expert system with over 80 pattern recognition and structuring algorithms that let the software detect content elements in a document much as the human eye does (titles, subtitles, paragraph styles, bulleted and numbered lists, images, tables, etc.). Also embedded in the software are the S1000D standard and its industry variants, including Raildex and Shipdex for the railway and shipping sectors, S2000M for materials management, and customer-specific XML and JSON schemas.
Khemeia™ automates the entire transformation process and goes beyond it. Using artificial intelligence techniques, Khemeia™ systematically extracts and semantically tags metadata, structures and hierarchically organizes information, generates tables of contents and converts the results to XML-based outputs, all in real time.
Khemeia™ creates structured content from paper and PDF, Word, ASCII, OCR (Optical Character Recognition) output, RTF, Excel, CSV, SGML, QuarkXPress, Adobe InDesign and HTML.
Unlike AI-based software solutions that require a minimum of 10,000 documents for testing and deployment, Khemeia™ can be tested and deployed with a maximum of 10 documents, significantly reducing implementation lead times. The software requires no customization to transform all categories of technical documentation, such as Maintenance Manuals, Service Bulletins, Job Information Cards, …
Detection of content elements in a class of documents as defined in the customer DTD (Document Type Definition) or XML Schema, for example: Section titles, Numbers, Headers, Paragraphs, Hyperlinks, Tables, Graphics.
Extracted content elements are semantically tagged, for example: Section titles (court name), Header (case name), Numbers (page numbers), Paragraphs (alinea), Tables (evidence list).
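As a rough illustration of detection followed by semantic tagging, the sketch below classifies plain-text lines with a few regular expressions and wraps each one in an XML element. It is not Khemeia™'s algorithm: the rule set, the tag names (sectionTitle, listItem, pageNumber, paragraph) and the function names are hypothetical, chosen only to show the general idea of rule-based tagging.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical detection rules (tag name, pattern); the first match wins.
RULES = [
    ("pageNumber", re.compile(r"^(?:Page\s+)?\d+$", re.IGNORECASE)),
    ("sectionTitle", re.compile(r"^\d+(?:\.\d+)*\s+\S.*$")),
    ("listItem", re.compile(r"^(?:[-*\u2022]|\(?[a-z]\))\s+\S.*$")),
]


def classify(line):
    """Return the tag of the first rule that matches, else 'paragraph'."""
    text = line.strip()
    for tag, pattern in RULES:
        if pattern.match(text):
            return tag
    return "paragraph"


def tag_lines(lines):
    """Wrap each non-empty line in an element named after its detected type."""
    root = ET.Element("document")
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no content
        ET.SubElement(root, classify(text)).text = text
    return root


sample = [
    "1 Introduction",
    "This manual describes the pump.",
    "- Check the seal",
    "Page 3",
]
xml_text = ET.tostring(tag_lines(sample), encoding="unicode")
print(xml_text)
```

In a real deployment the element names would come from the customer DTD or XML Schema rather than from a hard-coded list like the one assumed here.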
This involves splitting the document into relevant modules, creating the hierarchy, formatting bulleted lists, generating the Table of Contents and creating linkages.
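The hierarchy and Table of Contents steps can be pictured with a small sketch. It assumes, purely for illustration, that headings have already been detected and carry dotted numbers such as "1.2"; the Section class and both function names are hypothetical and do not describe how Khemeia™ works internally.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    number: str
    title: str
    children: list = field(default_factory=list)


def build_hierarchy(headings):
    """Nest (number, title) pairs, given in document order, by numbering depth."""
    root = Section("", "root")
    stack = [root]  # path from the root to the most recently added section
    for number, title in headings:
        depth = number.count(".") + 1  # "1" -> 1, "1.2" -> 2
        node = Section(number, title)
        del stack[depth:]  # discard sections at the same or a deeper level
        stack[-1].children.append(node)  # attach to the nearest ancestor
        stack.append(node)
    return root


def table_of_contents(section, depth=0):
    """Return indented Table of Contents lines for every nested section."""
    lines = []
    for child in section.children:
        lines.append("  " * depth + f"{child.number} {child.title}")
        lines.extend(table_of_contents(child, depth + 1))
    return lines


headings = [
    ("1", "Introduction"),
    ("1.1", "Scope"),
    ("1.2", "Safety"),
    ("2", "Maintenance"),
]
toc = table_of_contents(build_hierarchy(headings))
print("\n".join(toc))
```

Keeping the tree separate from the rendered Table of Contents means the same structure could also drive module splitting or cross-reference generation.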
Output types: XML, PDF, HTML, DITA, JPEG, XMP, NITF, NewsML, S1000D and customer-specific formats.