Packages

package util

Type Members

  1. class SerializableHadoopConf extends Serializable

    Hadoop Configuration wrapper safe to serialize into a Spark closure or broadcast.
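
    One common way to build such a wrapper is to mark the wrapped object as transient and write its entries by hand in the Java serialization hooks. The sketch below illustrates that pattern only: PlainConf is a hypothetical stand-in for a non-serializable configuration class, and nothing here is taken from SerializableHadoopConf itself.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.io.{ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable

// Hypothetical stand-in for a configuration class that is not Serializable.
final class PlainConf {
  val entries: mutable.Map[String, String] = mutable.LinkedHashMap.empty[String, String]
}

// The wrapper is Serializable; the wrapped value is transient and is
// written and rebuilt manually in writeObject / readObject.
final class SerializableConfSketch(@transient var value: PlainConf) extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    out.writeInt(value.entries.size)
    value.entries.foreach { case (k, v) => out.writeUTF(k); out.writeUTF(v) }
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = new PlainConf
    val n = in.readInt()
    for (_ <- 0 until n) {
      val k = in.readUTF()
      val v = in.readUTF()
      value.entries.put(k, v)
    }
  }
}

object SerializableConfSketch {
  /** Serialize and deserialize in memory, as a closure shipment would. */
  def roundTrip(w: SerializableConfSketch): SerializableConfSketch = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(w)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    in.readObject().asInstanceOf[SerializableConfSketch]
  }
}
```

    With the real Hadoop Configuration, the two hooks would typically delegate to the configuration's own write and readFields methods instead of copying entries one by one.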

Value Members

  1. object Asn1Indexer

    Builds and reads a sidecar record-offset index for BER/DER files, enabling Spark to split large files across multiple tasks.

    Index file format (<original>.asn1idx):

    • 8-byte header: 7-byte magic "ASN1IDX" + 1-byte version (0x01)
    • 8 bytes per record: big-endian Long byte offset of the record's first tag byte
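
    The layout above can be sketched with java.nio.ByteBuffer, which is big-endian by default. The object and method names are hypothetical and are not part of Asn1Indexer; only the byte layout comes from the format description.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Hypothetical helper illustrating the .asn1idx byte layout.
object IndexFormatSketch {
  val Magic: Array[Byte] = "ASN1IDX".getBytes(StandardCharsets.US_ASCII) // 7 bytes
  val Version: Byte = 0x01
  val HeaderSize = 8
  val EntrySize = 8

  /** 8-byte header followed by one big-endian Long per record offset. */
  def encode(offsets: Seq[Long]): Array[Byte] = {
    val buf = ByteBuffer.allocate(HeaderSize + EntrySize * offsets.length)
    buf.put(Magic)
    buf.put(Version)
    offsets.foreach(o => buf.putLong(o))
    buf.array()
  }

  /** Parse the bytes back, rejecting a wrong size, magic, or version. */
  def decode(bytes: Array[Byte]): Vector[Long] = {
    require(bytes.length >= HeaderSize, "index shorter than its header")
    require((bytes.length - HeaderSize) % EntrySize == 0, "truncated index entry")
    val buf = ByteBuffer.wrap(bytes)
    val magic = new Array[Byte](Magic.length)
    buf.get(magic)
    require(java.util.Arrays.equals(magic, Magic), "bad magic")
    require(buf.get() == Version, "unsupported version")
    Vector.fill((bytes.length - HeaderSize) / EntrySize)(buf.getLong())
  }
}
```

    Because every entry has the same width, entry i always starts at byte 8 + 8 * i, which is what makes random access into the index possible.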

    Only definite-length BER (and all DER) can be indexed. Indefinite-length constructions stop the scan early and produce a partial index.
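
    To illustrate why indefinite lengths end the scan, here is a minimal in-memory sketch of a record-offset scanner. It is an assumption about how such a scan could work, not the library's implementation: each record's end is computed from its length octets, which is impossible when the length octet is 0x80 (indefinite form).

```scala
// Hypothetical scanner over an in-memory buffer; names are illustrative only.
object RecordScanSketch {
  /** Byte offsets of the top-level TLV records in `data`. Stops at the first
    * indefinite-length or truncated record, so the result may be partial. */
  def scan(data: Array[Byte]): Vector[Long] = {
    val offsets = Vector.newBuilder[Long]
    var pos = 0
    var done = false
    while (!done && pos < data.length) {
      recordEnd(data, pos) match {
        case Some(end) =>
          offsets += pos.toLong
          pos = end
        case None =>
          done = true
      }
    }
    offsets.result()
  }

  /** End position (exclusive) of the record starting at `start`, if it can be sized. */
  private def recordEnd(data: Array[Byte], start: Int): Option[Int] = {
    var p = start + 1
    // High-tag-number form: low five bits all set, then octets with bit 8 set until the last one.
    if ((data(start) & 0x1f) == 0x1f) {
      while (p < data.length && (data(p) & 0x80) != 0) p += 1
      p += 1
    }
    if (p >= data.length) return None
    val first = data(p) & 0xff
    p += 1
    val length: Long =
      if (first < 0x80) first.toLong   // short form: the octet is the length
      else if (first == 0x80) -1L      // indefinite form: end cannot be computed
      else {
        val n = first & 0x7f           // long form: n length octets follow
        if (n > 7 || p + n > data.length) -1L
        else {
          var len = 0L
          for (i <- 0 until n) len = (len << 8) | (data(p + i) & 0xff)
          p += n
          len
        }
      }
    if (length < 0 || p + length > data.length) None else Some((p + length).toInt)
  }
}
```

    The sketch treats lengths wider than seven octets as unsupported, which is a simplification chosen to keep the arithmetic inside a Long.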

    Reading efficiency

    readIndexSlice uses HDFS positioned reads (pread) to binary-search the index file for the split boundaries, then reads only the matching slice sequentially. For a 100 M-record index (~800 MB), this costs about 27 pread round-trips per task (roughly log2 of the record count) plus a small sequential read. The full index is never loaded into memory.
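
    The search can be sketched against a local file, with java.nio's FileChannel.read(buffer, position) standing in for an HDFS positioned read. This is an illustration under assumptions, not readIndexSlice itself: the names are hypothetical, and the boundary rule (first entry whose offset is at or past the target) is assumed.

```scala
import java.io.EOFException
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

// Hypothetical lower-bound search over the fixed-width index entries.
object IndexSearchSketch {
  private val HeaderSize = 8L
  private val EntrySize = 8L

  /** Read index entry `i` using positioned reads only; the channel position is untouched. */
  private def entryAt(ch: FileChannel, i: Long): Long = {
    val buf = ByteBuffer.allocate(8)
    var pos = HeaderSize + i * EntrySize
    while (buf.hasRemaining) {
      val n = ch.read(buf, pos)
      if (n < 0) throw new EOFException("truncated index")
      pos += n
    }
    buf.flip()
    buf.getLong()
  }

  /** Returns (index of the first entry with offset >= target, number of entries probed). */
  def lowerBound(ch: FileChannel, target: Long): (Long, Int) = {
    var lo = 0L
    var hi = (ch.size() - HeaderSize) / EntrySize
    var probes = 0
    while (lo < hi) {
      val mid = lo + (hi - lo) / 2
      probes += 1
      if (entryAt(ch, mid) < target) lo = mid + 1 else hi = mid
    }
    (lo, probes)
  }
}
```

    Each probe halves the remaining range, so the probe count grows with the logarithm of the entry count: log2(10^8) is about 26.6, which matches the ~27 round-trips quoted above.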

  2. object Asn1Inspector

    Lightweight diagnostic utility that decodes the first few records from a local ASN.1 file without a SparkSession.

    Typical use: paste into a notebook or sbt console to verify your options before submitting a full Spark job.

    import io.github.sparkasn1.spark.asn1.util.Asn1Inspector
    Asn1Inspector.peek(
      schemaPaths = Seq("/tmp/cdr.asn1"),
      typeName    = "PGWRecord",
      encoding    = "ber",
      filePath    = "/tmp/sample.ber"
    )

    Can also be run from the command line:

    sbt "runMain io.github.sparkasn1.spark.asn1.util.Asn1Inspector \
      --schema cdr.asn1 --type PGWRecord --encoding ber --file sample.ber"

  3. object BerRealUtil

    BER/DER encoding of ASN.1 REAL (X.690 §8.5) without BouncyCastle support.

    Only the binary encoding (base-2) is produced. Special values +∞/-∞ use their standardised single-byte representations. Zero maps to empty content.
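
    As an illustration of the base-2 form, the sketch below produces the content octets (no tag or length) for a Double. It is a hypothetical helper, not BerRealUtil's API. It assumes the mantissa is reduced to an odd integer with scaling factor 0, which is how DER's canonical form is commonly described, and it simplifies by treating -0.0 like zero and rejecting NaN.

```scala
// Hypothetical encoder for the content octets of an ASN.1 REAL, base 2 only.
object BerRealSketch {
  def encodeContent(d: Double): Array[Byte] = {
    if (d.isNaN) throw new IllegalArgumentException("NaN is not handled in this sketch")
    else if (d == 0.0) Array.emptyByteArray          // zero: empty content (also -0.0 here)
    else if (d.isPosInfinity) Array(0x40.toByte)     // PLUS-INFINITY
    else if (d.isNegInfinity) Array(0x41.toByte)     // MINUS-INFINITY
    else encodeBinary(d)
  }

  private def encodeBinary(d: Double): Array[Byte] = {
    val bits = java.lang.Double.doubleToLongBits(d)
    val negative = bits < 0
    val rawExp = ((bits >>> 52) & 0x7ffL).toInt
    val frac = bits & 0x000fffffffffffffL
    // Rewrite the IEEE 754 value as (integer mantissa) * 2^exponent.
    val fullMantissa = if (rawExp == 0) frac else frac | (1L << 52)
    val fullExponent = (if (rawExp == 0) 1 else rawExp) - 1075
    // Strip trailing zero bits so the mantissa is odd.
    val shift = java.lang.Long.numberOfTrailingZeros(fullMantissa)
    val mantissa = fullMantissa >>> shift
    val exponent = fullExponent + shift

    // Exponent as two's complement in one or two octets (a Double never needs more).
    val expBytes =
      if (exponent >= -128 && exponent <= 127) Array(exponent.toByte)
      else Array((exponent >> 8).toByte, exponent.toByte)
    // Mantissa as a minimal big-endian unsigned integer.
    val mantLen = (64 - java.lang.Long.numberOfLeadingZeros(mantissa) + 7) / 8
    val mantBytes = Array.tabulate(mantLen)(i => (mantissa >>> (8 * (mantLen - 1 - i))).toByte)
    // First octet: bit 8 set (binary form), bit 7 sign, base 2, scale 0, exponent-length code.
    val first = 0x80 | (if (negative) 0x40 else 0x00) | (expBytes.length - 1)
    (first.toByte +: expBytes) ++ mantBytes
  }
}
```

    For example, 10.0 is 5 × 2^1, so its content octets are 0x80 (positive, base 2, one exponent octet), 0x01 (exponent), 0x05 (mantissa).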

  4. object BitUtils

    Bit-level utilities shared by PER encoder/decoder (Phase 2).
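
    The members of BitUtils are not listed on this page. As a rough idea of the kind of helper a PER codec needs, here is a hypothetical MSB-first bit reader; the name and signature are assumptions, not the actual BitUtils API.

```scala
// Hypothetical helper: read a bit field that need not start on a byte boundary.
object BitReadSketch {
  /** Read `n` bits (0 to 31) starting at absolute bit offset `bitPos`, most significant bit first. */
  def readBits(data: Array[Byte], bitPos: Int, n: Int): Int = {
    require(n >= 0 && n <= 31, "n must be between 0 and 31")
    require(bitPos >= 0 && bitPos + n <= data.length * 8, "read past end of buffer")
    var value = 0
    for (i <- bitPos until bitPos + n) {
      // Byte i / 8, bit 7 - (i % 8) within that byte.
      val bit = (data(i >> 3) >> (7 - (i & 7))) & 1
      value = (value << 1) | bit
    }
    value
  }
}
```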

  5. object SchemaCache

    Executor-local cache of parsed SchemaRegistry instances.

    Keyed on the set of (path, lastModified) pairs so that schema file changes are detected between jobs without restarting executors.

    Schema files must be accessible from every executor node: stored on HDFS or S3, or distributed via --files / SparkContext.addFile.
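
    The keying scheme can be sketched as follows. This is a hypothetical cache for local files only, using java.io.File.lastModified; the parse function stands in for building a SchemaRegistry, and none of the names come from SchemaCache itself.

```scala
import java.io.File
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache keyed on the set of (path, lastModified) pairs.
final class ModTimeCache[A](parse: Seq[String] => A) {
  private val cache = new ConcurrentHashMap[Set[(String, Long)], A]()

  /** Returns the cached value, re-parsing only when a path or modification time differs. */
  def get(paths: Seq[String]): A = {
    val key = paths.map(p => (p, new File(p).lastModified())).toSet
    cache.computeIfAbsent(key, _ => parse(paths))
  }
}
```

    Editing a schema file changes its modification time, which produces a new key and therefore a fresh parse on the next job. In this simple form, entries for older versions stay in the map until the executor exits.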
