Notes on RFC metadata in the v3 era
- New XSD is in place (10 Sept 2019): https://www.rfc-editor.org/in-notes/rfc-index.xsd
- rfc-index.txt, .html, .xml (and similar files) are slightly different from before. This is due to:
- added new publication formats and source format.
- removed file sizes. (Byte count is no longer included as metadata for each RFC.)
- There is only one page count per RFC, even if it is available in multiple file formats.
- As a result, in rfc-index.xml, page-count has been “pulled up” a level.
- The source of the page count data will change:
- For pre-v3 docs, it is the page count of the .txt file (except for a handful of old RFCs where there is no .txt file).
- For v3 docs, it will be the page count of the PDF file.
PDF naming conventions
- Externally, PDFs are listed simply as “PDF” (whether v3 or otherwise).
- Internally, the db holds “v3PDF” to mean v3 output; “PDF” is used for pre-v3.
- As before, .txt.pdf files are not listed in index files.
TEXT naming conventions
- Externally, .txt format (whether pre-v3 or not) is listed as “TEXT”.
Exception: rfc-index.xml will display “ASCII” (for pre-v3) and “TEXT” (afterwards). Rationale: the other index files were already using “TEXT” instead of “ASCII”, so they weren't changed to start differentiating.
- Internally, the db holds “TEXT” to mean v3 output; “ASCII” is used for pre-v3.
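The PDF and TEXT naming conventions above amount to a small lookup from internal db labels to the labels shown in the index files. The following is only an illustrative sketch of that mapping: the function name `external_label`, the `for_xml_index` flag, and the pass-through of labels not discussed in these notes are assumptions, not a description of the actual database code.

```python
# Internal db label -> label shown in most index files, per the notes above.
EXTERNAL_LABELS = {
    "v3PDF": "PDF",   # v3 PDF output
    "PDF": "PDF",     # pre-v3 PDF
    "TEXT": "TEXT",   # v3 text output
    "ASCII": "TEXT",  # pre-v3 text
}


def external_label(internal: str, for_xml_index: bool = False) -> str:
    """Map an internal format label to the externally displayed one."""
    if for_xml_index and internal in ("ASCII", "TEXT"):
        # rfc-index.xml is the exception: it keeps ASCII and TEXT distinct.
        return internal
    # Assumption: labels not covered in these notes are shown unchanged.
    return EXTERNAL_LABELS.get(internal, internal)
```

Under these assumptions, `external_label("ASCII")` gives "TEXT" for most index files, while `external_label("ASCII", for_xml_index=True)` keeps "ASCII".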
New resource: JSON files of RFC metadata
- One JSON file per RFC was made available on 11 September 2019.
- Each file contains data similar to an entry in rfc-index.xml.
- Sample file: https://www.rfc-editor.org/rfc/rfc5234.json
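As a rough sketch of how a per-RFC JSON file might be read: the helper names below are hypothetical, the URL pattern is assumed to generalize from the single sample file above, and only the "format" and "page_count" fields (the two shown in the examples in these notes) are used. Note that the examples show page_count as a string, so the sketch converts it to an integer.

```python
import json
from urllib.request import urlopen


def rfc_json_url(number: int) -> str:
    # Assumption: every RFC follows the pattern of the sample file URL.
    return f"https://www.rfc-editor.org/rfc/rfc{number}.json"


def parse_record(text: str) -> dict:
    """Extract the format list and page count from one JSON record."""
    record = json.loads(text)
    pages = record.get("page_count")
    return {
        "formats": record.get("format", []),
        # page_count appears as a string in the examples; convert if present.
        "page_count": int(pages) if pages else None,
    }


def fetch_record(number: int) -> dict:
    """Download and parse the JSON record for one RFC (needs network access)."""
    with urlopen(rfc_json_url(number), timeout=10) as resp:
        return parse_record(resp.read().decode("utf-8"))
```

For instance, `fetch_record(5234)` would request the sample file listed above.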
Examples
Example: RFC4254
-- OLD
<format>
<file-format>ASCII</file-format>
<char-count>50338</char-count>
<page-count>24</page-count>
</format>
<format>
<file-format>HTML</file-format>
</format>
-- NEW
<format>
<file-format>ASCII</file-format>
<file-format>HTML</file-format>
</format>
<page-count>24</page-count>
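Because page-count moved from inside a format element to a sibling of it, code written against the old layout may no longer find it. The following is a minimal sketch of a reader that accepts both layouts. It assumes the fragments are wrapped in an rfc-entry element (added here only so that each fragment parses as one document) and it ignores XML namespaces; if the real index file declares a namespace, the tag names would need to be qualified accordingly. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The OLD and NEW fragments from the example, wrapped in an assumed
# <rfc-entry> element so that each one is a well-formed document.
OLD = """<rfc-entry>
<format><file-format>ASCII</file-format><char-count>50338</char-count>
<page-count>24</page-count></format>
<format><file-format>HTML</file-format></format>
</rfc-entry>"""

NEW = """<rfc-entry>
<format><file-format>ASCII</file-format><file-format>HTML</file-format></format>
<page-count>24</page-count>
</rfc-entry>"""


def formats_and_pages(entry_xml: str):
    """Return (list of file formats, page count) for either layout."""
    entry = ET.fromstring(entry_xml)
    formats = [ff.text for ff in entry.iter("file-format")]
    # New layout: page-count is a direct child of the entry.
    node = entry.find("page-count")
    if node is None:
        # Old layout: page-count sits inside a <format> element.
        node = entry.find("format/page-count")
    pages = int(node.text) if node is not None and node.text else None
    return formats, pages
```

With these inputs, both `formats_and_pages(OLD)` and `formats_and_pages(NEW)` yield the same format list and page count, which is the point of the fallback.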
For comparison, the JSON record includes (among other data):
"format":["ASCII","HTML"],"page_count":"24"
Example: RFC8888 (v3 era)
-- NEW
<format>
<file-format>TEXT</file-format>
<file-format>HTML</file-format>
<file-format>PDF</file-format>
<file-format>XML</file-format>
</format>
<page-count>48</page-count>
For comparison, the JSON record would include (among other data):
"format":["TEXT","HTML","PDF","XML"],"page_count":"48"
