Retrieve files from the repository

From OSSelot
Revision as of 12:47, 13 June 2025 by CarstenEmde (talk | contribs) (Added output of script and explanation below the codes)

Retrieve files from the project's GitHub repository

Scripts and programs that access project files must take into account that some files are stored only in gzip-compressed form, because their uncompressed size would exceed the file size limit set by GitHub. If an access error occurs, a second attempt should therefore be made with the file extension “.gz” appended to the file name, and the file must be decompressed after downloading.

Example local file access with a Python script

#!/usr/bin/env python3
import json
import gzip

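def readOSSelot(name):
    # Return the parsed JSON from 'name' or, failing that, from 'name.gz'.
    # An empty string is returned if neither attempt succeeds.
    try:
        # First attempt: the uncompressed file
        with open(name, 'r', encoding='utf-8') as osselotfile:
            return json.load(osselotfile)
    except OSError:
        # File cannot be opened, fall through to the second attempt
        pass
    except ValueError:
        # File exists but does not contain valid JSON
        return ''
    try:
        # Second attempt: the gzip-compressed file
        with gzip.open(name + '.gz', 'rt', encoding='utf-8') as gzipfile:
            return json.load(gzipfile)
    except (OSError, EOFError, ValueError):
        return ''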

name = 'package-analysis/analysed-packages/bash/version-5.1.16/bash-5.1.16.spdx.json'
bash = readOSSelot(name)
if bash != '':
    print(bash['packages'][0]['name'])

name = 'package-analysis/analysed-packages/linux/version-6.0/linux-6.0.spdx.json'
linux = readOSSelot(name)
if linux != '':
    print(linux['packages'][0]['name'])

Run the script:

$ readOSSelot.py 
bash-5.1.16.tar.gz
linux-6.0.tar.xz

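In both cases, the correct file was opened and its contents were returned. In the second case, the specified name refers to a file that does not exist in uncompressed form, so the script fell back to the gzip-compressed file with the appended extension “.gz”.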
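
The fallback can also be tried without a checkout of the repository. The following is a minimal sketch and not part of the OSSelot tooling: the helper name load_json_with_gz_fallback, the file name demo.spdx.json and the sample data are made up for illustration and only mimic the 'packages' structure used above.

```python
import gzip
import json
import os
import tempfile

def load_json_with_gz_fallback(name):
    """Hypothetical helper: parse 'name', or 'name.gz' if 'name' cannot be opened."""
    try:
        with open(name, 'r', encoding='utf-8') as plain:
            return json.load(plain)
    except OSError:
        # The uncompressed file is not accessible; try the compressed variant.
        with gzip.open(name + '.gz', 'rt', encoding='utf-8') as packed:
            return json.load(packed)

def demo():
    """Create only the .gz file, then request the uncompressed name."""
    sample = {'packages': [{'name': 'demo-1.0.tar.gz'}]}  # stand-in data
    with tempfile.TemporaryDirectory() as tmpdir:
        name = os.path.join(tmpdir, 'demo.spdx.json')
        with gzip.open(name + '.gz', 'wt', encoding='utf-8') as packed:
            json.dump(sample, packed)
        return load_json_with_gz_fallback(name)['packages'][0]['name']

if __name__ == '__main__':
    print(demo())
```

Unlike readOSSelot above, this sketch lets errors from the second attempt propagate instead of returning an empty string, which makes a missing or damaged file easier to diagnose while experimenting.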

Example Web server access with a Python script

#!/usr/bin/env python3
import json
from urllib import request
import gzip

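def fetchOSSelot(url):
    # Return the parsed JSON from 'url' or, failing that, from 'url.gz'.
    # An empty string is returned if neither attempt succeeds.
    try:
        # First attempt: the uncompressed file
        with request.urlopen(url) as osselot:
            osselotdata = osselot.read()
    except Exception:
        try:
            # Second attempt: the gzip-compressed file
            with request.urlopen(url + '.gz') as osselotgz:
                osselotdata = gzip.decompress(osselotgz.read())
        except Exception:
            return ''
    try:
        return json.loads(osselotdata)
    except ValueError:
        # Downloaded data is not valid JSON
        return ''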

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/bash/version-5.1.16/bash-5.1.16.spdx.json'
bash = fetchOSSelot(url)
if bash != '':
    print(bash['packages'][0]['name'])

url = 'https://github.com/Open-Source-Compliance/package-analysis/raw/refs/heads/main/analysed-packages/linux/version-6.0/linux-6.0.spdx.json'
linux = fetchOSSelot(url)
if linux != '':
    print(linux['packages'][0]['name'])

Run the script:

$ fetchOSSelot.py 
bash-5.1.16.tar.gz
linux-6.0.tar.xz

As above, but with access via the network: in both cases, the correct file was downloaded and its contents were returned, even though the second URL refers to a file that does not exist in uncompressed form.
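
Once the data has been downloaded, a program can also check whether it is gzip-compressed instead of keeping track of which URL succeeded, because every gzip stream begins with the two bytes 0x1f 0x8b. The following is a minimal sketch of that alternative and not the method used by the scripts above; the helper name parse_maybe_gzipped and the sample data are made up for illustration.

```python
import gzip
import json

GZIP_MAGIC = b'\x1f\x8b'  # first two bytes of every gzip stream

def parse_maybe_gzipped(data):
    """Hypothetical helper: decompress 'data' if it is gzip, then parse it as JSON."""
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    return json.loads(data)

# Stand-in payload; no network access is needed for this demonstration.
sample = json.dumps({'packages': [{'name': 'demo-1.0.tar.gz'}]}).encode('utf-8')
plain_result = parse_maybe_gzipped(sample)
packed_result = parse_maybe_gzipped(gzip.compress(sample))
print(plain_result == packed_result)
```

Both calls yield the same dictionary, so the caller does not need to know in which form the data arrived.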