Text Analysis

This topic covers several kinds of text analysis: sorting, collation, segmentation, transcription, and transliteration.

Another rich source of information on this topic is the NRSI Computers & Writing Systems site.


Entries for this topic

Entries can contain text, graphics, media, files and software.

Character Identifier Tool from Tavultesoft
Unicode Sort Tailoring: Tutorial


Sources for this topic

Sources are references to books, web pages, articles and other materials.

Title Type
Unicode Word Macros Template - web page


Needs related to this topic

These are unmet needs for fonts, keyboards, other software and script-related information.

Description | Category | Status
OCR in Ethiopic, Bengali, Gujarati scripts | Software | Unmet

Copyright © 2017 SIL International and released under the Creative Commons Attribution-ShareAlike 3.0 license (CC-BY-SA) unless noted otherwise. Language data includes information from the Ethnologue. Script information partially from the ISO 15924 Registration Authority. Some character data from The Unicode Standard Character Database and locale data from the Common Locale Data Repository. Used by permission.