Collection Report
Collection name: The Language Archive
Total Score: 1663099.5524 out of 1993815.0000
Score percentage: 83.4%
Average Score: 12.51 out of 15.00
Maximal score in collection: 13.61
Minimal score in collection: 9.74
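The summary figures above are consistent with one another if each file can score at most 15.00 and the total score is the sum over all files. The following is a minimal sketch of that arithmetic, using only numbers from this report; the variable names are illustrative, and the per-file maximum is an assumption inferred from "out of 15.00".

```python
# Figures copied from this report.
total_score = 1663099.5524
max_total = 1993815.0
number_of_files = 132921
max_per_file = 15.0  # assumption: per-file maximum implied by "out of 15.00"

# Percentage and average as they would follow from the totals.
score_percentage = round(100 * total_score / max_total, 1)
average_score = round(total_score / number_of_files, 2)

print(score_percentage)                             # 83.4
print(average_score)                                # 12.51
print(number_of_files * max_per_file == max_total)  # True
```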
File Section
General information on the number of files and the file size.
Number of files: 132921
Total size: 2600975636 B
Average size: 19567 B
Minimal file size: 1649 B
Maximal file size: 16359447 B
Header Section
The header section shows information on the profile usage in the collection.
Important note: the score of this section differs from the score of the underlying profile. For more information on scoring, please have a look at the FAQ.
Total number of profiles: 16

ID | Score | Count |
---|---|---|
clarin.eu:cr1:p_1430905751641 | 2.46 | 98 |
clarin.eu:cr1:p_1361876010653 | 1.80 | 9 |
clarin.eu:cr1:p_1375880372947 | 1.72 | 225 |
clarin.eu:cr1:p_1396012485083 | 1.69 | 1265 |
clarin.eu:cr1:p_1328259700928 | 1.64 | 1920 |
clarin.eu:cr1:p_1331113992512 | 1.59 | 616 |
clarin.eu:cr1:p_1407745712035 | 1.57 | 104101 |
clarin.eu:cr1:p_1417617523856 | 1.47 | 2529 |
clarin.eu:cr1:p_1345561703620 | 1.29 | 577 |
clarin.eu:cr1:p_1361876010525 | 1.29 | 1 |
clarin.eu:cr1:p_1475136016242 | 1.28 | 1049 |
clarin.eu:cr1:p_1366895758243 | 1.27 | 1456 |
clarin.eu:cr1:p_1475136016239 | 1.25 | 374 |
clarin.eu:cr1:p_1337778924955 | 1.14 | 3824 |
clarin.eu:cr1:p_1407745712064 | 1.06 | 14865 |
clarin.eu:cr1:p_1456409483202 | 0.88 | 11 |
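As a quick consistency check, the per-profile counts in the table can be summed and compared with the "Number of Records" reported in the XML Validation Section. The sketch below simply copies the Count column; the list name is illustrative.

```python
# Count column of the profile table, in table order.
profile_counts = [
    98, 9, 225, 1265, 1920, 616, 104101, 2529,
    577, 1, 1049, 1456, 374, 3824, 14865, 11,
]

number_of_profiles = len(profile_counts)
records_covered = sum(profile_counts)

print(number_of_profiles)  # 16
print(records_covered)     # 132920
```

The sum equals the number of records in the XML Validation Section, which is one less than the number of files in the File Section.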
Facet Section
The facet section shows the facet coverage within the collection. The facet coverage of a CMD file cannot be higher than that of the profile it is based on.
facet-coverage: 92.9%

name | coverage |
---|---|
languageCode | 59.6% |
collection | 81.5% |
resourceClass | 0.0% |
modality | 47.8% |
format | 52.1% |
keywords | 0.0% |
genre | 72.0% |
subject | 17.0% |
country | 63.1% |
organisation | 45.2% |
name | 77.9% |
description | 65.7% |
license | 0.0% |
availability | 3.7% |
ResourceProxy Section
The resource proxy section shows information on the number of resource proxies and on the kind (the MIME type) of the resources. A resource proxy is a link to an external resource described by the CMD file.
Total number of resource proxies: 1192695
Average number of resource proxies: 8.97
Total number of resource proxies with MIME: 1189562
Average number of resource proxies with MIME: 8.95
Total number of resource proxies with reference: 1192695
Average number of resource proxies with references: 8.97
XML Validation Section
The XML validation section shows the result of a simple validation of each CMD file against its profile.
Number of Records: 132920
Number of valid Records: 132919
Ratio valid Records: 100.0%
Invalid Records:
File | Info | Validate |
---|---|---|
clarin/results/cmdi/The_Language_Archive/oai_www_mpi_nl_lat_1839_c5a1a1db_a71d_4974_840f_ca92fd6f92b5.xml | | |
XML Populated Section
The XML populated section shows information on the number of XML elements and on whether these elements contain data.
Total number of XML elements: 33988771
Average number of XML elements: 255.71
Total number of simple XML elements: 25885937
Average number of simple XML elements: 194.75
Total number of empty XML elements: 5971884
Average number of empty XML elements: 44.93
Average rate of populated elements: 76.9%
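The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the averages are taken over the 132920 records from the XML Validation Section and that the populated rate is the share of simple elements that are not empty. The report says "average rate", so the actual computation may be a per-file average; the aggregate ratio used here is an assumption that happens to agree to one decimal place.

```python
# Figures copied from this report.
records = 132920           # "Number of Records" in the XML Validation Section
total_elements = 33988771
simple_elements = 25885937
empty_elements = 5971884

# Per-record averages.
avg_elements = round(total_elements / records, 2)
avg_simple = round(simple_elements / records, 2)
avg_empty = round(empty_elements / records, 2)

# Assumption: populated rate = non-empty simple elements / simple elements.
populated_rate = round(100 * (1 - empty_elements / simple_elements), 1)

print(avg_elements)    # 255.71
print(avg_simple)      # 194.75
print(avg_empty)       # 44.93
print(populated_rate)  # 76.9
```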
URL Validation Section
The URL validation section shows information on the number of links and the results of link checking for the links which have been checked so far.
Total number of links: 1192617
Average number of links: 8.97
Total number of unique links: 1099075
Total number of checked links: 265520
Total number of undetermined links: 0
Average number of unique links: 8.27
Total number of broken links: 15206
Average number of broken links: 0.11
Ratio of valid links: 94.3%
Link Checking Results
Category | Count | Average Response Duration (ms) | Max Response Duration (ms) |
---|---|---|---|
Ok | 250106 | 2,543.04 | 9,673 |
Restricted_Access | 3 | 110 | 300 |
Blocked_By_Robots_txt | 205 | 0 | 0 |
Broken | 15206 | 578.77 | 3,687 |
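The category counts in the table add up to the number of checked links, and the ratio of valid links can be reproduced from them. The sketch below assumes that every checked link that is not in the Broken category counts as valid; the report does not state this rule explicitly.

```python
# Count column of the link checking table.
category_counts = {
    "Ok": 250106,
    "Restricted_Access": 3,
    "Blocked_By_Robots_txt": 205,
    "Broken": 15206,
}

checked_links = sum(category_counts.values())

# Assumption: every checked link that is not Broken is treated as valid.
valid_links = checked_links - category_counts["Broken"]
valid_ratio = round(100 * valid_links / checked_links, 1)

print(checked_links)  # 265520
print(valid_ratio)    # 94.3
```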
Invalid Files Section
The invalid files section shows the number of unprocessed CMD files in the collection and the reason each file was not processed.
- Invalid file: clarin/results/cmdi/The_Language_Archive/oai_www_mpi_nl_lat_1839_00_0000_0000_0021_629E_3.xml
Reason: Record oai_www_mpi_nl_lat_1839_00_0000_0000_0021_629E_3.xml has size: 16359447 bytes but the allowed limit when processing collections is 10000000 bytes.
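The size rule quoted in the reason above can be illustrated with a small check. This is only a sketch: the function name and constant are hypothetical, and whether a file of exactly 10000000 bytes would be accepted is an assumption, since the report only says that 16359447 bytes exceeds the limit.

```python
# Limit quoted in the report; the constant name is hypothetical.
MAX_RECORD_SIZE_BYTES = 10_000_000


def is_processable(size_bytes: int, limit: int = MAX_RECORD_SIZE_BYTES) -> bool:
    """Return True if a record of the given size is within the limit.

    Assumption: the limit is inclusive (a file of exactly `limit` bytes passes).
    """
    return size_bytes <= limit


print(is_processable(16359447))  # False: the maximal file size in this report
print(is_processable(19567))     # True: the average file size in this report
```

This is consistent with the report listing 132921 files but 132920 records: the single oversized file is the one excluded from processing.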