Tb-Scout v2.5 is an extraction tool, not a conversion tool

First of all, a reminder: Tb-Scout v2.5 only extracts and exports data in plain text. It does not export data in other formats such as TBX or XML. Please keep this in mind.

What type of files are exported?

Figure 1.

Almost all exported files are in Excel (.xlsx) or PDF format. There is only one exception: the basic termbase statistics report is exported in text (.txt) format.

Can the data exported be reused?

Figure 2.

To answer this question, first review the information provided in the section "Exploring, searching, finding and extracting data to Excel (or PDF)".

Only the data exported in a bilingual layout (i.e. three columns: Concept ID, source language and target language) is ready to be used in other scenarios. This applies to exporting terms by language pairs and to exporting a basic dictionary.

These Microsoft Excel three-column files

  • can be exported or saved as "Text (Tab delimited) (*.txt)", or
  • can be exported to XML format using MultiTerm 2021/2019/2017 Convert.

Files in these formats (.txt and .xml) can be reused to create or grow SDL termbases or to make conversions into other CAT tools.
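As a rough illustration of the tab-delimited layout mentioned above, the following Python sketch writes three-column rows (Concept ID, source term, target term) as tab-separated text. It is not part of Tb-Scout or MultiTerm; the sample rows, the column headers and the function name to_tab_delimited are assumptions made up for this example.

```python
import csv
import io

# Hypothetical rows in the bilingual layout: Concept ID, source term, target term.
rows = [
    ("1", "termbase", "base terminológica"),
    ("2", "entry", "entrada"),
]


def to_tab_delimited(rows, header=("Concept ID", "English", "Spanish")):
    """Serialize three-column rows as tab-delimited text (header names are assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Reject anything that is not exactly three columns wide.
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        writer.writerow(row)
    return buffer.getvalue()


text = to_tab_delimited(rows)
print(text)
```

The same structure is what Excel produces when a three-column sheet is saved as "Text (Tab delimited) (*.txt)": one record per line, with columns separated by a tab character.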

On the other hand, data filtered by descriptive field content, as shown above in Figure 2, is laid out in such a way that it can be reused to create a new termbase only if the user knows how to merge different tables in a database program.

In the picture above, the three basic levels of a termbase are clearly indicated in the "Level" column, depending on the termbase being explored: (1) Entry level, (2) Language level and (3) Term level. The Language level is also known as the Index level, which is why Tb-Scout v2.5 marks it with an "I-" prefix: I-English, I-Spanish, I-Chinese, and so on. For the example at hand, we extracted all the records with a descriptive field called "Notes", which in this particular termbase is included at all three levels.

Needless to say, it is important to understand these three levels, (1) Entry level, (2) Language level and (3) Term level, in order to understand how MultiTerm works.
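To make the three levels more concrete, here is a minimal Python sketch that groups exported rows by Concept ID and by level. Only the "I-" prefix for the Language (Index) level comes from the description above; the row layout, the "Entry" and "Term" labels, the sample notes and the function names are assumptions for illustration, and the actual export may label its rows differently.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, level, field_name, value).
# Only the "I-" prefix is taken from the text; everything else is assumed.
rows = [
    (6, "Entry", "Notes", "Note attached to the whole entry"),
    (6, "I-English", "Notes", "Note attached to the English index"),
    (6, "I-Spanish", "Notes", "Note attached to the Spanish index"),
    (6, "Term", "Notes", "Note attached to a single term"),
]


def level_kind(level):
    """Classify a "Level" value as 'entry', 'language' or 'term'."""
    if level.startswith("I-"):
        return "language"  # Index level, e.g. I-English
    if level == "Entry":   # assumed label for the Entry level
        return "entry"
    return "term"          # everything else is treated as Term level here


def group_by_concept(rows):
    """Build {concept_id: {kind: [(level, value), ...]}} from flat rows."""
    grouped = defaultdict(lambda: defaultdict(list))
    for concept_id, level, _field, value in rows:
        grouped[concept_id][level_kind(level)].append((level, value))
    return grouped


grouped = group_by_concept(rows)
for kind in ("entry", "language", "term"):
    print(kind, grouped[6][kind])
```

Merging the exported tables in a database program amounts to the same idea: the Concept ID ties the rows together, and the level says where in the entry each value belongs.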

To recap, for the example illustrated in Figure 2 above, we used a result dataset from data filtered by descriptive field content and a bird's eye view of a particular entry ("Concept ID" number 6).

[If you want to learn more about how a termbase is structured according to established standards, jump to more information about terminology.]

How many instances can be running at a time?

  • You can only run one instance of the application at a time. If you try to run another instance, you will be informed, and the application will close. This feature is intended to avoid unintended cache data corruption and/or inconsistent search results when exploring any given termbase.
  • If the application closes unexpectedly, it is left in an unknown state. The next time you try to open it, the system will assume you are trying to run another instance, and the application will have to be closed. When you open it a second time, it will run normally.
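The behavior described in the two points above can be modeled with a simple marker file. The Python sketch below is only an illustration of that behavior, not how Tb-Scout is actually implemented; the marker file and the functions try_start and shut_down are hypothetical. Note that this simplified model clears the marker whenever it refuses a launch, which reproduces the recovery after a crash but would be too permissive for a production-grade guard.

```python
import os
import tempfile


def try_start(marker_path):
    """Return True if this launch may run, False if it must close."""
    if os.path.exists(marker_path):
        # A marker is present: another instance is running, or a previous
        # run crashed and never removed it. This sketch cannot tell which,
        # so it refuses the launch and clears the marker on the way out.
        os.remove(marker_path)
        return False
    with open(marker_path, "w") as marker:
        marker.write(str(os.getpid()))
    return True


def shut_down(marker_path):
    """Normal exit: remove the marker so the next launch starts cleanly."""
    if os.path.exists(marker_path):
        os.remove(marker_path)


# Demonstration in a throwaway directory.
marker = os.path.join(tempfile.mkdtemp(), "tbscout-demo.marker")

first = try_start(marker)        # normal start, marker is created
# ... simulated crash: shut_down() is never called ...
after_crash = try_start(marker)  # stale marker found, launch refused
retry = try_start(marker)        # marker was cleared, launch allowed
shut_down(marker)

print(first, after_crash, retry)
```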

How many terms can Tb-Scout v2.5 handle?

Figure 3.

The example above uses a subset of the extensive IATE terminology collection. Tb-Scout v2.5 quickly determined that this particular termbase has 6 languages and a total of 641,642 terms, which it can handle without problems in both basic and enhanced searches.

However, the application is limited when it comes to searching through these terms for descriptive fields and/or dates in a particular language pair.

Usually, what slows or halts the application is the number of descriptive fields rather than the number of records. By running a basic termbase statistics report first, you can see how many descriptive fields are listed and know beforehand what to expect.

When there are no descriptive fields, the application says so ("No Descriptive Fields found") in the Export module, below pane (1), as shown in Figure 4. When the termbase being explored does have descriptive fields, a green label appears there instead.

Figure 4.

Even with no descriptive fields, it may be possible to jump to the Export module, but processing may not work effectively if the termbase has more than 30,000 terms per language pair. In those cases, the only choice is to generate a bilingual dictionary in PDF format.

Yet, if all you care about is getting a language pair dataset of terms only, you may want to try the By TERMS ONLY process instead. In some instances, the application may even be able to deal with larger termbases, but we do not want to overpromise.

It may be possible to export huge bilingual dictionaries

Figure 5.

Figure 5 above shows a screen snapshot of a bilingual dictionary generated by Tb-Scout v2.5. It shows the footer (of the previous page) and the header (of the next page) of a dictionary in PDF format with 3,400 pages (pages, not terms!) processed and exported by the application using the Export a basic dictionary feature.

The application clipboard is limited

Figure 6.

In the example above, there is a dataset with 143,972 records ready to be exported to an Excel file. However, the application clipboard can only handle 65,534 records, so it is advisable to opt for a PDF file instead. This option may fail as well, so as a last resort, export the data as a bilingual dictionary.
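The choice described above can be summarized as a small decision rule. The Python sketch below is only an illustration and not part of Tb-Scout: the 65,534-record limit comes from the text, while the function name export_options and the returned labels are made up for this example.

```python
CLIPBOARD_LIMIT = 65_534  # clipboard record limit stated in the text


def export_options(record_count):
    """Return the export routes to try, in order, for a dataset of this size."""
    if record_count <= CLIPBOARD_LIMIT:
        # The dataset fits in the clipboard, so Excel export is worth trying.
        return ["excel"]
    # Too many records for the clipboard: try PDF first, then fall back
    # to exporting a bilingual dictionary as a last resort.
    return ["pdf", "bilingual dictionary"]


print(export_options(143_972))
print(export_options(20_000))
```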

Read-only termbases

In general, read-only termbases, marked with a red asterisk (as shown below), cannot be explored with the Export module, because information other than the terms themselves is encrypted and stored not in plain text but as 'binary data'. Read-only termbases can, however, be searched using the basic and enhanced search modes. The good news is that, most of the time, you can still explore them to extract bilingual data, i.e. terms. Bear in mind that read-only termbases might behave unpredictably under Tb-Scout v2.5.

Figure 7.

Application size

As explained in the clear cache storage page, when you browse through different termbases, Tb-Scout v2.5 saves data in temporary storage called the cache, which makes the application file grow in size. For ideal performance, keep the size below 10 MB, although during a normal processing task it is acceptable for it to exceed even 100 MB. In any case, the label that indicates the size of the application (at the bottom-right corner) turns red when a certain threshold is reached, and it is then advisable to clear the cache storage.

Figure 8.

Application not responding

At some point, for a number of reasons, but mainly because a process within a big termbase causes the application to stop responding, you will see this message with three options:

Figure 9.

The first option to consider is "Wait for the program to respond". In many cases that is exactly what will happen: the program will respond.

If the application definitely will not respond, choose "Close the program" rather than "Restart the program". Because the application is then closed abruptly, it remains marked as a running instance, so the next time you start it you will have to close it and start it again. In addition, once you choose to close the application (second menu choice, above), press [Cancel] immediately, since there is no need to send a crash report to Microsoft.

 


Known issues