Collection Report

Collection name: The Language Archive

URL: https://curation.clarin.eu/collection/The_Language_Archive.xml

Total Score: 434524.3571 out of 517995.0000

Score percentage: 83.9%

Average Score: 12.58 out of 15.00

Maximal score in collection: 13.30

Minimal score in collection: 9.91
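
The summary figures above agree with one another, which can be checked with a short sketch. The formulas are assumptions, not taken from the report: the percentage is read as total score over maximum total, and the average as total score divided by the number of files given in the File Section.

```python
# Values copied from this report. The formulas are assumptions about how
# the figures relate, not the curation module's documented method.
total_score = 434524.3571
max_total = 517995.0
n_files = 34533  # "Number of files" from the File Section

# Share of the attainable score, in percent, to one decimal place
score_percentage = round(100 * total_score / max_total, 1)

# Mean score per file, to two decimal places
average_score = round(total_score / n_files, 2)

# Maximum attainable score per file
max_per_file = max_total / n_files

print(score_percentage, average_score, max_per_file)
```

Under these assumptions the three values match the reported 83.9%, 12.58 and 15.00.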


Creation time: 2021-12-01 20:06:28.870+01:00 [Europe/Vienna]


File Section

Number of files: 34533

Total size: 693985960 B

Average size: 20096 B

Minimal file size: 1720 B

Maximal file size: 16359447 B


Header Section

Profiles in Collection
Total number of profiles: 14
ID Score Count
clarin.eu:cr1:p_1361876010653 1.80 1
clarin.eu:cr1:p_1375880372947 1.72 41
clarin.eu:cr1:p_1396012485083 1.69 193
clarin.eu:cr1:p_1328259700928 1.64 383
clarin.eu:cr1:p_1407745712035 1.56 28925
clarin.eu:cr1:p_1430905751641 1.46 18
clarin.eu:cr1:p_1417617523856 1.46 295
clarin.eu:cr1:p_1345561703620 1.29 60
clarin.eu:cr1:p_1361876010525 1.29 1
clarin.eu:cr1:p_1475136016242 1.28 128
clarin.eu:cr1:p_1366895758243 1.27 263
clarin.eu:cr1:p_1475136016239 1.25 59
clarin.eu:cr1:p_1407745712064 1.06 4156
clarin.eu:cr1:p_1456409483202 0.88 9

Facet Section

facet-coverage: 85.7%
name coverage
languageCode 73.1%
collection 82.4%
resourceClass 0.0%
modality 61.9%
format 69.1%
keywords 0.0%
genre 85.4%
subject 13.6%
country 82.9%
organisation 54.4%
name 96.1%
description 75.4%
license 0.0%
availability 4.2%

ResourceProxy Section

Total number of resource proxies: 363787

Average number of resource proxies: 10.53

Total number of resource proxies with MIME: 363228

Average number of resource proxies with MIME: 10.52

Total number of resource proxies with reference: 363787

Average number of resource proxies with references: 10.53


XML Validation Section

Number of Records: 34532

Number of valid Records: 34532

Ratio valid Records: 100.0%


XML Populated Section

Total number of XML elements: 9178388

Average number of XML elements: 265.79

Total number of simple XML elements: 7047952

Average number of simple XML elements: 204.09

Total number of empty XML elements: 1788998

Average number of empty XML elements: 51.81

Average rate of populated elements: 74.6%


URL Validation Section

Total number of links: 363784

Average number of links: 10.53

Total number of unique links: 352730

Total number of checked links: 311731

Total number of undetermined links: 8

Average number of unique links: 10.21

Total number of broken links: 4317

Average number of broken links: 0.12

Ratio of valid links: 98.6%

Link Checking Results

Category       Count    Average Response Duration (ms)   Max Response Duration (ms)
Broken         4317     282.39                           4427
Ok             307406   2216.02                          9774
Undetermined   8        0                                0
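
The link counts in this section can be cross-checked against each other. The sketch below assumes, without confirmation from the report, that the checked links are the sum of the three categories and that the ratio of valid links is the "Ok" count divided by the checked count.

```python
# Counts copied from the Link Checking Results table. The relationships
# between them are assumptions, not documented behaviour of the link checker.
broken = 4317
ok = 307406
undetermined = 8

# Sum of all categories; compare with "Total number of checked links"
checked = broken + ok + undetermined

# Share of checked links that resolved, in percent, to one decimal place
valid_ratio = round(100 * ok / checked, 1)

print(checked, valid_ratio)
```

Both results match the reported values of 311731 checked links and a 98.6% ratio of valid links.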


Invalid Files Section

  1. Invalid file: /usr/local/curation-module/data/clarin/results/cmdi/The_Language_Archive/oai_www_mpi_nl_lat_1839_00_0000_0000_0021_629E_3.xml
    Reason: Record oai_www_mpi_nl_lat_1839_00_0000_0000_0021_629E_3.xml has size: 16359447 bytes but the allowed limit when processing collections is 10000000 bytes.
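
The size rule behind this rejection can be sketched as follows. The function name `exceeds_limit` and the constant name are hypothetical, and the strict "greater than" comparison is an assumption. Only the 10000000-byte limit and the file sizes come from the report.

```python
# Limit quoted in the rejection reason above. The constant and function
# names are hypothetical, chosen for illustration only.
MAX_RECORD_BYTES = 10_000_000


def exceeds_limit(size_bytes: int, limit: int = MAX_RECORD_BYTES) -> bool:
    """Return True when a record is larger than the allowed limit.

    Assumption: a record of exactly `limit` bytes is still accepted.
    """
    return size_bytes > limit


# Sizes taken from the File Section: the maximal and the average file size
print(exceeds_limit(16359447))
print(exceeds_limit(20096))
```

With this reading, only the 16359447-byte record listed above is rejected, which would account for the File Section counting 34533 files while the XML Validation Section counts 34532 records.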