Retrieve files from the repository

From OSSelot
Revision as of 15:12, 13 June 2025 by CarstenEmde (talk | contribs)

Retrieve files from the project's GitHub repository

When scripts or programs access project files, they must take into account that some files may be available only in gzip-compressed form, because their uncompressed size would exceed the file size limit set by GitHub. If an access error occurs, a second attempt should therefore be made with the file extension “.gz” appended to the file name, and the file should be decompressed after downloading.

Example local file access with a Python script

#!/usr/bin/env python3
import json
import gzip

def readOSSelot(name):
    """Return the parsed JSON from name, or from name + '.gz'; '' on failure."""
    try:
        # First attempt: the uncompressed file.
        with open(name, 'r') as osselotfile:
            osselotdata = osselotfile.read()
    except OSError:
        try:
            # Second attempt: the gzip-compressed variant.
            with gzip.open(name + '.gz', 'rb') as gzipfile:
                osselotdata = gzipfile.read()
        except Exception:
            return ''
    try:
        return json.loads(osselotdata)
    except json.JSONDecodeError:
        return ''

name = 'package-analysis/analysed-packages/bash/version-5.1.16/bash-5.1.16.spdx.json'
bash = readOSSelot(name)
if bash != '':
    print(bash['packages'][0]['name'])

name = 'package-analysis/analysed-packages/linux/version-6.0/linux-6.0.spdx.json'
linux = readOSSelot(name)
if linux != '':
    print(linux['packages'][0]['name'])

Run the script:

$ readOSSelot.py 
bash-5.1.16.tar.gz
linux-6.0.tar.xz

In both cases the correct file was opened and its contents returned, even though the file name given in the second case does not exist; only its gzip-compressed variant with the “.gz” extension is available.
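The fallback can also be checked without a checkout of the repository. The following is a minimal sketch, not part of the OSSelot tooling: the helper name read_json_with_gz_fallback, the file name demo.spdx.json and the sample content are made up for illustration. It creates only the compressed file in a temporary directory and then asks for the uncompressed name.

```python
import gzip
import json
import os
import tempfile

def read_json_with_gz_fallback(name):
    """Parse JSON from name, or from name + '.gz' if name cannot be opened."""
    try:
        with open(name, 'rb') as plain:
            data = plain.read()
    except OSError:
        # Plain file missing or unreadable: try the compressed variant.
        with gzip.open(name + '.gz', 'rb') as packed:
            data = packed.read()
    return json.loads(data)

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name and content, used only for this demonstration.
    name = os.path.join(tmpdir, 'demo.spdx.json')
    sample = {'packages': [{'name': 'demo.tar.gz'}]}
    # Only the compressed file is created; the plain name does not exist.
    with gzip.open(name + '.gz', 'wb') as packed:
        packed.write(json.dumps(sample).encode('utf-8'))
    plain_exists = os.path.exists(name)
    demo = read_json_with_gz_fallback(name)

print(plain_exists, demo['packages'][0]['name'])
```

Unlike readOSSelot, this sketch lets errors from the second attempt propagate instead of returning an empty string, which may be preferable while debugging.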

Example Web server access with a Python script

#!/usr/bin/env python3
import json
from urllib import request
import gzip

def fetchOSSelot(url):
    """Return the parsed JSON from url, or from url + '.gz'; '' on failure."""
    try:
        # First attempt: the uncompressed file.
        with request.urlopen(url) as osselot:
            osselotdata = osselot.read()
    except Exception:
        try:
            # Second attempt: the gzip-compressed variant, decompressed in memory.
            with request.urlopen(url + '.gz') as osselotgz:
                osselotdata = gzip.decompress(osselotgz.read())
        except Exception:
            return ''
    try:
        return json.loads(osselotdata)
    except json.JSONDecodeError:
        return ''

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/bash/version-5.1.16/bash-5.1.16.spdx.json'
bash = fetchOSSelot(url)
if bash != '':
    print(bash['packages'][0]['name'])

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/linux/version-6.0/linux-6.0.spdx.json'
linux = fetchOSSelot(url)
if linux != '':
    print(linux['packages'][0]['name'])

Run the script:

$ fetchOSSelot.py
bash-5.1.16.tar.gz
linux-6.0.tar.xz

As above, but with network access: in both cases the correct file was downloaded and its contents returned, even though the URL given in the second case points to a non-existent file.
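Instead of relying on the “.gz” extension alone, a script can also inspect the downloaded bytes. The following is a minimal sketch under the assumption that the payload is either plain JSON or a gzip stream; the helper name parse_json_payload and the sample data are hypothetical. It checks for the two-byte gzip signature before parsing.

```python
import gzip
import json

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b'\x1f\x8b'

def parse_json_payload(payload):
    """Parse JSON from bytes, decompressing first if they look like gzip."""
    if payload[:2] == GZIP_MAGIC:
        payload = gzip.decompress(payload)
    return json.loads(payload)

# Hypothetical sample data standing in for a downloaded file.
raw = json.dumps({'packages': [{'name': 'demo.tar.gz'}]}).encode('utf-8')
packed = gzip.compress(raw)

print(parse_json_payload(raw) == parse_json_payload(packed))
```

Because valid JSON text cannot begin with the byte 0x1f, the check does not misclassify uncompressed files.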

Retrieving uncompressed files from the OSSelot repository mirror with a Python script

Finally, if decompression is not an option for any reason, uncompressed files can be retrieved directly from the repository mirror at the OSSelot Web site:

#!/usr/bin/env python3
import json
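from urllib import request

def fetchOSSelot2(url):
    """Return the parsed JSON from url, or from the osselot.org mirror; '' on failure."""
    try:
        # First attempt: the GitHub repository.
        with request.urlopen(url) as osselot:
            osselotdata = osselot.read()
    except Exception:
        try:
            # Second attempt: the uncompressed copy on the OSSelot mirror.
            with request.urlopen(url.replace('github.com', 'osselot.org')) as osselot:
                osselotdata = osselot.read()
        except Exception:
            return ''
    try:
        return json.loads(osselotdata)
    except json.JSONDecodeError:
        return ''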

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/bash/version-5.1.16/bash-5.1.16.spdx.json'
bash = fetchOSSelot2(url)
if bash != '':
    print(bash['packages'][0]['name'])

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/linux/version-6.0/linux-6.0.spdx.json'
linux = fetchOSSelot2(url)
if linux != '':
    print(linux['packages'][0]['name'])

Run the script:

$ fetchOSSelot2.py
bash-5.1.16.tar.gz
linux-6.0.tar.xz

Again as above, the correct file was downloaded and its contents returned in both cases, even though the file named in the second case exists in uncompressed form only on the repository mirror at the OSSelot Web site.
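The access paths described on this page (the original URL, the same URL with “.gz” appended, and the mirror URL) could also be tried in a single loop. The following is only a sketch: the function names candidate_urls and fetch_first, the order of the attempts, and the example URL are assumptions, and the download itself is passed in as a function so that the logic can be exercised without network access.

```python
def candidate_urls(url):
    """Return the URLs to try, in order, for one file (order is an assumption)."""
    return [url, url + '.gz', url.replace('github.com', 'osselot.org')]

def fetch_first(url, fetch):
    """Call fetch on each candidate; return (url used, data) of the first success."""
    for candidate in candidate_urls(url):
        try:
            return candidate, fetch(candidate)
        except OSError:
            # This candidate is unavailable: move on to the next one.
            continue
    return None, b''

def fake_fetch(candidate):
    """Stand-in for a download: only the '.gz' variant 'exists'."""
    if not candidate.endswith('.gz'):
        raise OSError('not found: ' + candidate)
    return b'compressed bytes'

# Hypothetical URL, used only for this demonstration.
used, data = fetch_first('https://github.com/example/demo.spdx.json', fake_fetch)
print(used)
```

In a real script, fetch could be a small wrapper around request.urlopen, whose URLError is a subclass of OSError. The returned URL tells the caller whether the data still needs to be decompressed.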