Extract data from bills of lading
Bills of lading (BoLs) are the title documents of international trade. They prove that the cargo was shipped, define who holds the right to claim it at the discharge port, and serve as the contract of carriage between the shipper and the carrier. Every container ship leaving Shanghai, Rotterdam, or Long Beach is shadowed by a stack of BoLs, each one tied to a specific load, a specific consignee, and a specific set of Incoterms that determine who pays for what when the cargo moves.
Freight forwarders, customs brokers, and logistics teams at importers all need the same data: BoL number, vessel name and voyage, port of loading, port of discharge, shipper, consignee, notify party, cargo description, weight and volume, marks and numbers on the containers, and the freight terms (prepaid or collect).
The hard part is formatting drift across carriers. Maersk uses one layout, MSC another, Hapag-Lloyd a third, and house BoLs from forwarders depart from carrier standards entirely. Cargo descriptions are often free text spanning multiple lines, occasionally in two languages stacked vertically. Container numbers follow ISO 6346 (a three-letter owner code, a one-letter equipment category identifier, a six-digit serial number, and a check digit), but the same BoL may list 1 container or 47. Seal numbers and marks-and-numbers fields are sometimes embedded inside the cargo description block rather than broken out. Letter-of-credit BoLs include endorsements on the back that change the legal title transfer. Telex-released BoLs replace the original-paper-document workflow with an electronic release and have to be flagged as such because customs treats them differently.
Talonic extracts the full BoL structure regardless of carrier or layout. Every container, seal, and cargo line is captured. Incoterms are normalized to the ICC 2020 set (FOB, CIF, EXW, DDP, and the others). Multi-page BoLs with attached cargo manifests are stitched into a single record.
A typical BoL issued on 2026-04-12 for a Shanghai to Long Beach shipment with $48,200 of declared value can move through US Customs in hours rather than days when its data lands in a broker's filing system as structured JSON. Every extracted cell carries a confidence score and a pixel-region reference so customs brokers and L/C banks can audit any number against the source before clearing the shipment.
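To make the Incoterms normalization mentioned above concrete, here is a minimal Python sketch that splits a free-text term such as "FOB Shanghai" into a rule code and a named place. The function name `parse_incoterm` and the `Incoterm` tuple are hypothetical and chosen for illustration; this is not Talonic's implementation, only one way the idea could be expressed. The set of eleven codes is the Incoterms 2020 rule list.

```python
import re
from typing import NamedTuple, Optional

# The eleven three-letter rules defined by Incoterms 2020.
INCOTERMS_2020 = frozenset(
    {"EXW", "FCA", "CPT", "CIP", "DAP", "DPU", "DDP", "FAS", "FOB", "CFR", "CIF"}
)


class Incoterm(NamedTuple):
    """Hypothetical container for a normalized term."""
    code: str
    named_place: Optional[str]


def parse_incoterm(raw: str) -> Optional[Incoterm]:
    """Split free text like 'FOB Shanghai' into (code, named place).

    Returns None when the text does not start with a 2020 rule code.
    """
    match = re.match(r"\s*([A-Za-z]{3})\b[\s,:-]*(.*)", raw)
    if not match:
        return None
    code = match.group(1).upper()
    if code not in INCOTERMS_2020:
        return None
    place = match.group(2).strip() or None
    return Incoterm(code, place)


print(parse_incoterm("FOB Shanghai"))
print(parse_incoterm("DAT Hamburg"))  # DAT was replaced by DPU in the 2020 edition
```

Rejecting unknown codes rather than guessing keeps retired terms such as DAT visible to a reviewer instead of silently mapping them to a current rule.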
What gets extracted from bills of lading
How extraction works for bills of lading
BoLs originate from carrier IT systems, freight forwarder TMS platforms, and occasionally from buyer-side ERP modules. Talonic classifies each BoL (ocean, inland, house, master) and runs it through the freight document schema in the Field Registry, which encodes shipper, consignee, vessel, ports, cargo descriptions, and Incoterms across carrier formats. Container numbers are validated against ISO 6346 check digits. Multi-page cargo manifests are stitched into the main BoL record. Weight units are normalized (KG, LB, MT) to a canonical numeric field plus an explicit unit. Every extracted cell is returned with a confidence score and pixel-region provenance under DIN SPEC 91491 conformity, so customs brokers, letter-of-credit banks, and shipping lines can audit any field against the source PDF before clearing the cargo.
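The weight normalization step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's code: the names `parse_weight`, `to_kilograms`, and the alias table are hypothetical, commas are assumed to be thousands separators, and MT is taken to mean the metric tonne (1,000 kg).

```python
import re
from decimal import Decimal
from typing import Tuple

# Hypothetical alias table mapping printed unit spellings to canonical units.
UNIT_ALIASES = {
    "KG": "KG", "KGS": "KG", "KILOS": "KG",
    "LB": "LB", "LBS": "LB",
    "MT": "MT", "TONNE": "MT", "TONNES": "MT",
}

# Exact conversion factors to kilograms (1 lb = 0.45359237 kg by definition).
KG_PER_UNIT = {
    "KG": Decimal("1"),
    "LB": Decimal("0.45359237"),
    "MT": Decimal("1000"),
}


def parse_weight(raw: str) -> Tuple[Decimal, str]:
    """Turn text like '4,820 KGS' into a numeric value plus a canonical unit."""
    match = re.fullmatch(r"\s*([\d,]*\.?\d+)\s*([A-Za-z]+)\.?\s*", raw)
    if not match:
        raise ValueError(f"unrecognized weight: {raw!r}")
    # Assumption: commas are thousands separators, not decimal commas.
    value = Decimal(match.group(1).replace(",", ""))
    unit = UNIT_ALIASES.get(match.group(2).upper())
    if unit is None:
        raise ValueError(f"unrecognized unit in: {raw!r}")
    return value, unit


def to_kilograms(value: Decimal, unit: str) -> Decimal:
    """Convert a canonical (value, unit) pair to kilograms."""
    return value * KG_PER_UNIT[unit]


value, unit = parse_weight("4,820 KGS")
print(value, unit, to_kilograms(value, unit))
```

Keeping the original value and unit alongside any converted figure means the record can still be checked against what is printed on the document.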
Sample extraction
A 2-page ocean Bill of Lading from Maersk, Shanghai to Los Angeles
{
  "bol_number": "MAEU245789213",
  "issue_date": "2026-04-12",
  "shipper_name": "Shanghai Export Co., Ltd.",
  "shipper_address": "Pudong District, Shanghai 200120, China",
  "consignee_name": "Acme Imports LLC",
  "consignee_address": "88 Harbor Blvd, Long Beach, CA 90802, USA",
  "carrier_name": "Maersk Line",
  "vessel_or_flight": "EVER GIVEN / 124E",
  "port_of_loading": "CNSHA Shanghai",
  "port_of_discharge": "USLAX Los Angeles",
  "cargo_items_description": "12 pallets industrial machine parts, HS 8479.89",
  "cargo_items_quantity": 12,
  "cargo_items_gross_weight": 4820,
  "cargo_items_marks_and_numbers": "MAEU 7654321 / Seal 9988",
  "incoterms": "FOB Shanghai",
  "freight_terms": "Freight Prepaid"
}
Frequently asked
Does it handle both ocean and inland Bills of Lading?
Yes. Ocean BoLs, inland BoLs, sea waybills, multimodal transport documents, and combined transport bills all run through the same schema. Mode-specific fields (vessel for ocean, truck plate for inland) are populated when present and left null when not applicable.
How are container numbers validated?
Container numbers follow ISO 6346: a three-letter owner code, a one-letter equipment category identifier (U, J, or Z), a six-digit serial number, and a single check digit, for 11 characters in total. Talonic validates the check digit during extraction; mismatches are flagged in the confidence output so customs brokers can verify against the physical container.
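For readers who want to reproduce the check, the ISO 6346 check digit calculation can be sketched in Python as follows. The function names are hypothetical and this is not Talonic's code. The algorithm itself is the standard one: letters map to values starting at A=10 and skipping multiples of 11, each of the first ten characters is weighted by a power of two according to its position, and the sum is taken modulo 11, with a remainder of 10 written as 0.

```python
import re
import string

# Letter values per ISO 6346: A=10, then ascending, skipping 11, 22, and 33.
LETTER_VALUES = {}
_value = 10
for _letter in string.ascii_uppercase:
    if _value % 11 == 0:
        _value += 1
    LETTER_VALUES[_letter] = _value
    _value += 1


def iso6346_check_digit(first_ten: str) -> int:
    """Compute the check digit from the owner code, category, and serial."""
    total = 0
    for position, ch in enumerate(first_ten):
        value = LETTER_VALUES[ch] if ch.isalpha() else int(ch)
        total += value * (2 ** position)
    # A remainder of 10 is represented by the digit 0.
    return total % 11 % 10


def is_valid_container_number(raw: str) -> bool:
    """Check format and check digit, ignoring spaces, hyphens, and slashes."""
    normalized = re.sub(r"[\s\-/]", "", raw.upper())
    if not re.fullmatch(r"[A-Z]{3}[UJZ]\d{7}", normalized):
        return False
    return iso6346_check_digit(normalized[:10]) == int(normalized[10])


print(is_valid_container_number("CSQU3054383"))
print(is_valid_container_number("MAEU 7654321"))
```

The illustrative number `MAEU 7654321` from the sample record would be flagged by this check, because the computed check digit for `MAEU765432` is 0.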
What about house BoLs versus master BoLs?
Both extract cleanly. A house BoL issued by a freight forwarder references a master BoL issued by the actual carrier; Talonic captures the linking reference numbers so brokers can match the two. The document_type field distinguishes house from master.
Are telex-released BoLs flagged?
Yes. When the source PDF carries a telex-release stamp or annotation, Talonic surfaces a release_method indicator so customs and the consignee know that physical surrender of an original BoL is not required for cargo release.
Can it process multi-language BoLs?
Yes. BoLs from non-English carriers often include Chinese, Korean, Japanese, Arabic, or other-script cargo descriptions alongside an English transcription. Talonic preserves both and treats the English version as canonical for downstream routing where one is available.
Author note
Reviewed by Talonic engineering, freight document review · last reviewed 2026-05-13