

As libraries have increasingly come to recognize the value of digitizing historical works in their holdings, many institutions with significant collections of Chinese materials have committed themselves to large-scale scanning projects, often making the resulting images freely available over the internet.

While an enormously positive development in itself, for many scholarly use cases this represents only the first step towards adequate digitization of these works.

Since its origins in 2005 as an online search tool for a small number of classical Chinese texts, the Chinese Text Project has grown to become one of the largest and most widely used digital libraries of pre-modern Chinese writing. It contains tens of thousands of transmitted texts dating from the Warring States through to the late Qing and Republican period, and also serves as a platform for the application of digital methods to the study of pre-modern Chinese literature.

Unlike most digital libraries and full-text databases, users of the site are not passive consumers of its materials, but instead active curators through whose work it is maintained and developed – and increasingly, not all users of the library are human.

Volunteers located around the world correct mistakes and add modern punctuation to the texts as time allows and according to their own interests – typically hundreds of corrections are made each day.

Digital methods have revolutionized many aspects of the study of pre-modern Chinese literature, from the simple but transformative ability to perform full-text searches and automated concordancing, through to the application of sophisticated statistical techniques that would be entirely impractical without the aid of a computer.
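As a rough illustration of what automated concordancing involves, the following minimal sketch lists every occurrence of a search term together with a few characters of context on either side (a keyword-in-context view). The function name `concordance`, its `width` parameter, and the unpunctuated sample passage are illustrative assumptions, not part of any particular tool.

```python
def concordance(text: str, term: str, width: int = 5) -> list[str]:
    """Return one keyword-in-context line per occurrence of term in text."""
    lines = []
    start = text.find(term)
    while start != -1:
        end = start + len(term)
        left = text[max(0, start - width):start]   # context before the hit
        right = text[end:end + width]              # context after the hit
        lines.append(f"{left}[{term}]{right}")
        start = text.find(term, start + 1)         # continue after this hit
    return lines


# Illustrative sample: the opening of the Analects without punctuation.
sample = "學而時習之不亦說乎有朋自遠方來不亦樂乎人不知而不慍不亦君子乎"
hits = concordance(sample, "不亦", width=3)
for line in hits:
    print(line)
```

Because classical Chinese is written without spaces between words, the sketch matches on raw character sequences rather than on tokenized words.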

While the methods themselves have evolved significantly – and continue to do so – one of the most fundamental prerequisites to almost all digital studies of Chinese literature remains access to reliable digital editions of the texts themselves.


OCR software must correctly identify all of these instances as corresponding to the same abstract character – a challenging task for a computer.

In an attempt to address this problem, the Chinese Text Project has developed a hybrid system in which uncorrected OCR results are imported directly into a database. The database provides full-text search of the source images, assembles the contents of the scanned pages into complete textual transcriptions, and offers an integrated mechanism for users to correct the data directly.
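The hybrid workflow described above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions, not the Chinese Text Project's actual implementation: the class names `Page` and `Transcription`, the method names, the image file names, and the sample OCR error are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """One scanned page: an image reference plus its OCR text."""
    image: str                      # hypothetical path of the page scan
    text: str                       # OCR output, possibly containing errors
    history: list[str] = field(default_factory=list)  # earlier versions


class Transcription:
    """Hypothetical in-memory stand-in for the database described above."""

    def __init__(self) -> None:
        self.pages: list[Page] = []

    def import_ocr(self, image: str, ocr_text: str) -> None:
        # Store OCR output as-is; no correction is required up front.
        self.pages.append(Page(image, ocr_text))

    def search(self, term: str) -> list[tuple[int, str]]:
        # Return (page number, image) for each page containing the term,
        # so that every hit can be checked against its scan.
        return [(n, p.image)
                for n, p in enumerate(self.pages, start=1)
                if term in p.text]

    def full_text(self) -> str:
        # Assemble the page texts, in order, into one transcription.
        return "".join(p.text for p in self.pages)

    def correct(self, page_number: int, wrong: str, right: str) -> None:
        # Apply a user correction to one page, keeping the previous version.
        page = self.pages[page_number - 1]
        if wrong not in page.text:
            raise ValueError(f"{wrong!r} not found on page {page_number}")
        page.history.append(page.text)
        page.text = page.text.replace(wrong, right, 1)


doc = Transcription()
doc.import_ocr("scan_0001.png", "學而時習之不亦說平")  # assumed misreading of 乎 as 平
doc.import_ocr("scan_0002.png", "有朋自遠方來不亦樂乎")
doc.correct(1, "說平", "說乎")
print(doc.search("不亦"))
print(doc.full_text())
```

Keeping the text attached to individual pages, rather than only to the assembled transcription, is what lets a search hit or a correction be traced back to the scanned image it came from.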


