Extract information from a PDF invoice
Thanks to the Invoice Extract product powered by AI, you can convert a PDF invoice to a FatturaPA XML invoice.
This is useful in Italy, where any cross-border invoice received from a foreign supplier must be transformed into a valid FatturaPA XML self-invoice. The document type in this case must be one of:
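- TD17 integrazione/autofattura per acquisto servizi dall'estero (integration/self-invoice for the purchase of services from abroad)
- TD18 integrazione per acquisto di beni intracomunitari (integration for the purchase of intra-Community goods)
- TD19 integrazione/autofattura per acquisto di beni ex art.17 c.2 DPR 633/72 (integration/self-invoice for the purchase of goods under art. 17 c. 2 DPR 633/72)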
Ask to get access
Write to business@a-cube.io to activate the product in sandbox and to get a quotation.
Note: the sandbox environment is limited by default to 10 document conversions.
Operations flow
- POST /invoice-extract to provide the PDF file.
Example
curl --location 'https://api-sandbox.acubeapi.com/invoice-extract' \
--header 'Authorization: Bearer YOUR TOKEN HERE' \
--form 'file=@"/path/to/file.pdf"'
You will get a JSON response with the job details:
{
  "uuid": "unique job identifier",
  "acquisition_date": "date time",
  "filename": "string",
  "job_status": "waiting|success|error",
  "pages": null
}
- GET /invoice-extract/{uuid} to get the status of the job.
  When the job_status field switches to success, you can proceed to obtain the XML result.
- GET /invoice-extract/{uuid}/result to obtain the XML file.
  Note: you can get the invoice in either XML or JSON format. The Accept header must be set to application/xml or application/json.
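The three steps above amount to a submit, poll, fetch loop. The following Python sketch illustrates the polling part only, under stated assumptions: get_job is a hypothetical caller-supplied function that performs GET /invoice-extract/{uuid} with your HTTP client and returns the decoded JSON, and the polling interval and attempt limit are arbitrary values, not documented limits.

```python
import time

BASE_URL = "https://api-sandbox.acubeapi.com"  # sandbox host used in the curl examples


def status_url(uuid):
    """URL of the job status endpoint."""
    return f"{BASE_URL}/invoice-extract/{uuid}"


def result_url(uuid):
    """URL of the conversion result endpoint."""
    return f"{status_url(uuid)}/result"


def accept_header(fmt):
    """Accept header for the wanted result format: 'xml' or 'json'."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    return {"Accept": f"application/{fmt}"}


def wait_for_job(uuid, get_job, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the job until job_status is no longer 'waiting'.

    get_job is a hypothetical callable: it takes the uuid and returns the
    job details as a dict (the JSON shown in the POST response).
    """
    for _ in range(max_attempts):
        job = get_job(uuid)
        status = job["job_status"]
        if status == "success":
            return job
        if status == "error":
            raise RuntimeError(f"job {uuid} ended with an error")
        sleep(interval)  # still 'waiting': pause before the next request
    raise TimeoutError(f"job {uuid} still waiting after {max_attempts} attempts")
```

Once wait_for_job returns, the result can be requested from result_url(uuid) with the headers from accept_header("xml") or accept_header("json"), plus the usual Authorization header.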
Options
The POST /invoice-extract API optionally accepts a configuration JSON.
The form field name to send is conversion_configuration.
The configuration JSON can contain:
- default_vat_rate: if not provided, a 22% VAT rate is applied in the XML by default
- convert_amounts: if set to false, you will get prices in the original currency of the invoice
Example
curl --location 'https://api-sandbox.acubeapi.com/invoice-extract' \
--header 'Authorization: Bearer YOUR TOKEN HERE' \
--form 'file=@"/path/to/file.pdf"' \
--form 'conversion_configuration="{\"default_vat_rate\":0,\"convert_amounts\":false}"'
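If the request is built programmatically, serializing the configuration with a JSON library avoids the manual quote escaping needed in the curl example. This is a minimal sketch: the helper name build_conversion_configuration is hypothetical, and it assumes that omitting a key leaves the server-side default in effect.

```python
import json


def build_conversion_configuration(default_vat_rate=None, convert_amounts=None):
    """Serialize the optional settings into the conversion_configuration value.

    Keys left as None are omitted (assumption: the API then applies its
    own defaults, such as the 22% VAT rate).
    """
    config = {}
    if default_vat_rate is not None:
        config["default_vat_rate"] = default_vat_rate
    if convert_amounts is not None:
        config["convert_amounts"] = convert_amounts
    return json.dumps(config, separators=(",", ":"))


# Same configuration as the curl example: 0% VAT, keep the original currency.
form_fields = {
    "conversion_configuration": build_conversion_configuration(0, False),
}
```

The resulting form_fields dictionary would be sent as multipart form data alongside the file field.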
Which information the converted XML will contain
- The AI algorithm tries to extract supplier and customer information. If the VAT number is found, the BusinessRegistry database is used as a data fallback.
- Geocoding functionalities split the addresses into the required fields.
- Invoice number and invoice date
- The AI algorithm extracts the invoice detail lines and these will be converted into detail lines of the XML invoice.
- The DatiRiepilogo fields are generated automatically from the detail lines that were found.
- Prices are converted into EUR using the exchange rate published by the Bank of Italy web services for the invoice date.
- Description texts are cleaned up by converting special characters to Latin-1 entities.
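To illustrate what generating DatiRiepilogo from the detail lines involves, the sketch below groups lines by VAT rate and computes the taxable amount and tax for each group. It is not the product's actual algorithm: it only assumes the standard FatturaPA element names (AliquotaIVA and PrezzoTotale on detail lines; AliquotaIVA, ImponibileImporto and Imposta in the summary) and half-up rounding to cents.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def summarize_lines(lines):
    """Build one summary entry per VAT rate from the detail lines.

    Each line is a dict with string values for 'AliquotaIVA' (VAT rate in
    percent) and 'PrezzoTotale' (line total, VAT excluded).
    """
    totals = defaultdict(Decimal)  # VAT rate -> sum of line totals
    for line in lines:
        totals[Decimal(line["AliquotaIVA"])] += Decimal(line["PrezzoTotale"])

    summary = []
    for rate in sorted(totals):
        taxable = totals[rate].quantize(CENT, rounding=ROUND_HALF_UP)
        tax = (taxable * rate / 100).quantize(CENT, rounding=ROUND_HALF_UP)
        summary.append(
            {"AliquotaIVA": rate, "ImponibileImporto": taxable, "Imposta": tax}
        )
    return summary


example_lines = [
    {"AliquotaIVA": "22.00", "PrezzoTotale": "100.00"},
    {"AliquotaIVA": "22.00", "PrezzoTotale": "50.50"},
    {"AliquotaIVA": "0.00", "PrezzoTotale": "10.00"},
]
example_summary = summarize_lines(example_lines)
```

Decimal is used instead of float so that sums and rounding behave predictably for monetary amounts.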
Which information the converted XML will not contain
The document type (TipoDocumento): for example, in the case of a self-invoice it can be one of TD17, TD18, or TD19, depending on the type of items sold (goods or services).
Choosing the correct document type is currently beyond the capabilities of the Invoice Extract algorithms.
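Since the document type has to be decided by the caller, one possible approach is to set it on the XML result before sending the self-invoice onward. The sketch below is an illustration under assumptions: it relies on the standard FatturaPA layout, where TipoDocumento is the first child of DatiGeneraliDocumento, and it does not assume whether the returned XML omits the element or leaves it empty, so both cases are handled. The function name set_document_type is hypothetical.

```python
import xml.etree.ElementTree as ET

ALLOWED_TYPES = {"TD17", "TD18", "TD19"}


def set_document_type(xml_text, document_type):
    """Return the XML with TipoDocumento set to the chosen self-invoice type."""
    if document_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported document type: {document_type}")

    root = ET.fromstring(xml_text)
    # FatturaPA child elements are not namespace-qualified, so a plain
    # tag search works even when the root element carries a prefix.
    general = root.find(".//DatiGeneraliDocumento")
    if general is None:
        raise ValueError("DatiGeneraliDocumento not found")

    node = general.find("TipoDocumento")
    if node is None:
        node = ET.Element("TipoDocumento")
        general.insert(0, node)  # first child, as the FatturaPA schema orders it
    node.text = document_type
    # Note: ElementTree may rename the root namespace prefix on output.
    return ET.tostring(root, encoding="unicode")
```

For example, set_document_type(xml_result, "TD17") would mark the invoice as a purchase of services from abroad.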