Extracting data from Corsa can be split into two parts: database extraction and document extraction.
Since the document management system Corsa uses a relational database, extracting data from it involves tying columns together. Many tables contain an object id and an object type. The object type can be poststuk, dossier, case, etc. An object id is unique only within its object type, so the same object id can occur under multiple object types. An object is therefore identified by the combination of object type and object id.
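To illustrate why both columns are needed, here is a minimal sketch using an in-memory SQLite database. The table names (`objects`, `notes`), column names and sample rows are hypothetical and do not reflect the actual Corsa schema; the sketch only shows the effect of joining on the object id alone versus joining on object type and object id together.

```python
import sqlite3

# Hypothetical tables and data, for illustration only (not the Corsa schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (object_type TEXT, object_id TEXT, title TEXT,
                          PRIMARY KEY (object_type, object_id));
    CREATE TABLE notes (object_type TEXT, object_id TEXT, note TEXT);
    INSERT INTO objects VALUES ('poststuk', '1001', 'Incoming letter'),
                               ('dossier',  '1001', 'Building permit');
    INSERT INTO notes VALUES ('poststuk', '1001', 'Scanned'),
                             ('dossier',  '1001', 'Closed');
""")

# Joining on object_id alone mixes up the poststuk and the dossier.
ambiguous_count = conn.execute("""
    SELECT COUNT(*) FROM objects AS o
    JOIN notes AS n ON n.object_id = o.object_id
""").fetchone()[0]

# Joining on both columns ties each note to the right object.
rows = conn.execute("""
    SELECT o.object_type, o.title, n.note
    FROM objects AS o
    JOIN notes AS n
      ON n.object_type = o.object_type
     AND n.object_id = o.object_id
    ORDER BY o.object_type
""").fetchall()

print(ambiguous_count)  # 4 combinations, two of them wrong
print(rows)             # one correct row per object
```

In this sketch the id-only join returns four rows for two objects, while the join on both columns returns exactly two.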
Document extraction
Documents are stored on a fileshare by the Document Server. The directory structure is usually as follows (depends on configuration):
$share$/ds_files/<database name>/<parent type>/<file category>/<hashed folders>/<hashed filename>.<extension>
The parent type is either
A (agenda). The file category is
ocr. The hashing is done by converting the object id to a hexadecimal value. Before conversion, the object id must be 10 characters long for poststuk and 30 characters long for agenda. Any object id with fewer characters must be prefixed with spaces until it reaches that length. The version number of the document, minus the first 0, is then appended to the hexadecimal value. After that the extension is appended and the entire value is split into chunks of 8 hexadecimal characters to create the hashed folders and filename.
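The hashing steps can be sketched in Python as follows. This is a sketch under several assumptions that the description above does not confirm: that "converting to hexadecimal" means writing out the ASCII code of each character, that the hexadecimal digits are upper case, that "minus the first 0" means dropping one leading zero from the version, and that the extension stays attached to the last chunk instead of being split. The function name `hashed_parts` and the version value used in the usage example are hypothetical.

```python
# Padding widths per object type, as described in the text.
ID_WIDTH = {"poststuk": 10, "agenda": 30}


def hashed_parts(object_id: str, object_type: str, version: str, extension: str):
    """Return (folders, filename) for a document, under the stated assumptions."""
    # Prefix the object id with spaces up to the required length.
    padded = object_id.rjust(ID_WIDTH[object_type])
    # Assumption: hexadecimal conversion = ASCII code of each character, upper case.
    hex_value = padded.encode("ascii").hex().upper()
    # Assumption: "minus the first 0" = drop a single leading zero.
    version_part = version[1:] if version.startswith("0") else version
    hashed = hex_value + version_part
    # Split into chunks of 8 characters; the last chunk becomes the filename.
    chunks = [hashed[i:i + 8] for i in range(0, len(hashed), 8)]
    *folders, name = chunks
    return folders, f"{name}.{extension}"


# Hypothetical usage: the version "01" is made up for illustration.
folders, filename = hashed_parts("10IN12345", "poststuk", "01", "tif")
print(folders)   # ['20313049', '4E313233']
print(filename)  # 34351.tif
```

The returned folders and filename still have to be joined with the share, database name, parent type and file category from the directory structure above. Verify the output against a few known files on the fileshare before relying on it, since the assumptions may not match a given installation.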
As an example, let's take a poststuk that has object id 10IN12345 and a TIF document with version
Step 1: prefix the object id with spaces until it has a length of 10 characters. Result:
Step 2: convert to a hexadecimal value. Result:
Step 3: append the document version number without the first 0 and its extension. Result:
Step 4: split into chunks of 8 characters. Result:
Now we can generate the paths for both the original and archive copies, namely:
Robots
There is a set of robots available, attached as a zip file to this article.
At the moment these robots use the UDM, but not in line with the UDM design philosophy. We are in the process of rewriting them, but we still believe this is a good set of robots to get started with.