Convert bitmap images of chemical structural formulas into MDL MOL files.
MLOCSR performs optical recognition of chemical formulas in two stages. First, a low-level processor extracts candidate points, lines, and formula text boxes. Second, Markov Logic is used to reason about these low-level entities and the relations between them to construct a molecular structure, which is then output as a MOL file.
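To make the final output step concrete, here is a minimal sketch of how a recognized structure could be serialized as a V2000 MOL file. It is not MLOCSR's own code: the function name `write_molfile` and the tuple-based atom and bond lists are assumptions made for illustration, and only the mandatory blocks (header, counts line, atoms, bonds, `M  END`) are written.

```python
def write_molfile(atoms, bonds, name="molecule"):
    """Serialize a structure as a minimal V2000 MOL file (illustrative only).

    atoms: list of (symbol, x, y) tuples; z is written as 0 for a 2D diagram.
    bonds: list of (i, j, order) tuples using 1-based atom indices.
    """
    # Header block: title line, program/timestamp line, comment line.
    lines = [name, "", ""]
    # Counts line: atom count, bond count, eight unused fields, 999, version tag.
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    # Atom block: fixed-width coordinates, element symbol, twelve zeroed fields.
    for symbol, x, y in atoms:
        lines.append(
            f"{x:10.4f}{y:10.4f}{0.0:10.4f} {symbol:<3s}" + " 0" + "  0" * 11
        )
    # Bond block: first atom, second atom, bond order, stereo flag (0 = none).
    for i, j, order in bonds:
        lines.append(f"{i:3d}{j:3d}{order:3d}  0")
    lines.append("M  END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Heavy atoms of methanol: one carbon single-bonded to one oxygen.
    print(write_molfile([("C", 0.0, 0.0), ("O", 1.5, 0.0)], [(1, 2, 1)]))
```

In this sketch the atoms would come from the recognized points and text boxes and the bonds from the recognized lines; how MLOCSR represents these internally is not described here.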
Upload an image file containing a chemical diagram to obtain an editable vector representation of the molecule. Supported formats include JPEG (.jpg, .jpeg), PNG, and TIF. Only single molecules are currently supported; in particular, images containing tables or reactions are not supported. Sample images (from the USPTO data set used for the development of OSRA) can be downloaded from here.
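Before uploading, a file can be checked against the formats listed above. The following is a small sketch under stated assumptions: the helper name `is_supported_image` is hypothetical, the check looks only at the file extension rather than the file contents, and case-insensitive matching is assumed rather than confirmed for the service.

```python
from pathlib import Path

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif"}


def is_supported_image(filename):
    """Return True if the file extension is one of the listed formats."""
    # Lower-casing the suffix is an assumption about how names are matched.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in ["aspirin.png", "scan.TIF", "scheme.gif"]:
        print(candidate, is_supported_image(candidate))
```

A check like this cannot tell whether an image holds a single molecule, so files containing tables or reactions would still pass it.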
A standalone distribution is currently under development.