<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<base href="http://blogs.loc.gov/digitalpreservation/feed/"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1043208734;
mso-list-template-ids:1314934018;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:1.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:1.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:2.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:2.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:3.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:3.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:4.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:4.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">May be of interest! From the Library of Congress.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">---<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">Janet Carleton | Digital Initiatives Coordinator | Digital Initiatives | Mahn Center for Archives and Special Collections, Preservation & Digital Initiatives | OHIO University Libraries | Alden
333 | Athens, Ohio | 740.597.2527 | </span><span style="color:black"><a href="mailto:carleton@ohio.edu"><span style="font-size:10.0pt;color:black">carleton@ohio.edu</span></a></span><span style="font-size:10.0pt;color:black"> |
</span><span style="color:black"><a href="https://media.library.ohio.edu/"><span style="font-size:10.0pt;color:black">https://media.library.ohio.edu</span></a></span><span style="font-size:10.0pt;color:black"> | she/her/hers<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>Feed:</b> The Signal<br>
<b>Posted on:</b> Thursday, June 29, 2023 10:10 AM<br>
<b>Author:</b> Liz Holdzkom<br>
<b>Subject:</b> Filling in the File Format Gaps<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<table class="MsoNormalTable" border="0" cellspacing="3" cellpadding="0">
<tbody>
<tr>
<td style="padding:.75pt .75pt .75pt .75pt">
<p><em><span style="font-family:"Calibri",sans-serif">Today’s guest post is from <b>Kate Murray</b>, <b>Marcus Nappier</b>, and <b>Liz Holdzkom</b> of the Digital Collections Management & Services Division at the Library of Congress.</span></em><o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" align="center">
</div>
<p>This is the fourth installment of our semi-annual blog series about file format research for the
<a href="https://www.loc.gov/preservation/digital/formats/index.html?loclr=blogsig">
Sustainability of Digital Formats: Planning for Library of Congress Collections</a> at the Library of Congress. If you’re a file format fan, take a look at the other entries:
<a href="http://blogs.loc.gov/thesignal/2021/12/fun-with-file-formats/?loclr=blogsig">
Fun with File Formats</a>, <a href="http://blogs.loc.gov/thesignal/2022/06/return-to-the-fascinating-world-of-file-formats/?loclr=blogsig">
Return to the Fascinating World of File Formats!</a>, and <a href="http://blogs.loc.gov/thesignal/2022/12/even-more-fun-with-file-formats/?loclr=blogsig">
Even More Fun with File Formats!</a>. We may not have the most creative blog post titles, but we know our way around a specification and how to find a
<a href="https://www.garykessler.net/library/file_sigs.html">magic number</a>.<o:p></o:p></p>
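<p>For readers new to magic numbers, here is a minimal Python sketch of how a file's leading bytes can hint at its format. It is an illustration only, not the method used for the FDDs: the <code>sniff_format</code> helper and the small signature table are assumptions made for this example, and a real identification tool checks far more than a handful of leading bytes.</p>
<pre>
```python
import bz2
import gzip

# Leading byte signatures ("magic numbers") for a few formats named in this post.
SIGNATURES = {
    "gzip": b"\x1f\x8b",
    "bzip2": b"BZh",
    "zip": b"PK\x03\x04",  # non-empty ZIP files; WACZ packages are ZIP-based
}


def sniff_format(header: bytes) -> str:
    """Return a best-guess format name for the first bytes of a file."""
    # WebP is a RIFF container: "RIFF", a 4-byte length, then "WEBP".
    if header[0:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


# Compress a few bytes in memory so the demo needs no files on disk.
print(sniff_format(gzip.compress(b"file formats")))
print(sniff_format(bz2.compress(b"file formats")))
```
</pre>
<p>A signature match is only a hint. Many formats share a container (every ZIP-based package starts the same way), so a match on the first bytes narrows the candidates rather than settling the question.</p>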
<p>This has been a busy few months for your favorite file format folks! Let’s catch you up on all the goings-on.<o:p></o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif">New and updated file format descriptions (and LOTS of them)</span></strong><o:p></o:p></p>
<p>Thanks in part to a contract with NVision Solutions, we have published 30 new file format descriptions (known as FDDs) to our site this calendar year. A full list of the new entries is available on our
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd_workplan.shtml#2022-2023?loclr=blogsig">
2022-2023 workplan</a> and we’re also keeping our <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd_workplan.shtml#pub-log?loclr=blogsig">
publication log</a> up to date so you can follow along at home when we publish a new one.<o:p></o:p></p>
<p>These new FDDs fall into several content categories:<o:p></o:p></p>
<ul type="disc">
<li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1">
<strong><span style="font-family:"Calibri",sans-serif">Accessibility support formats</span></strong> include both formats for screen readers and audio players and formats for captions and subtitles in audiovisual content. A few highlights include
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000553.shtml?loclr=blogsig">
BRF (Braille Ready Format)</a> (FDD 551), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000551.shtml?loclr=blogsig">
HBL (Braille Sense Format File)</a> (FDD 553), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000567.shtml?loclr=blogsig">
WebVTT (Web Video Text Tracks Format)</a> (FDD 567), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000569.shtml?loclr=blogsig">
SRT (SubRip Subtitle Format)</a> (FDD 569) and <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000571.shtml?loclr=blogsig">
SUB (VobSub Subtitle Format)</a> (FDD 571). This focus on accessibility is linked to related projects in the
<a href="https://www.digitizationguidelines.gov/guidelines/accessibilty_AV_collections.html">
Accessibility Subgroup</a> of the Federal Agencies Digital Guidelines Initiative (FADGI) AudioVisual Working Group. An additional FADGI project is researching <a href="https://blogs.loc.gov/thesignal/2023/05/new-fadgi-project-researching-accessibility-in-open-source-digital-preservation-applications/?loclr=blogsig">
accessibility for open-source digital preservation applications</a>.<o:p></o:p></li><li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1">
<strong><span style="font-family:"Calibri",sans-serif">3D, Virtual Reality and related design formats</span></strong> support preferences in the
<a href="https://www.loc.gov/preservation/resources/rfs/design3D.html?loclr=blogsig">
Recommended Formats Statement</a> as well as other efforts in the Library. A few new entries to note include
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000557.shtml?loclr=blogsig">
3MF (3D Manufacturing Format)</a> (FDD 557), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000563.shtml?loclr=blogsig">
E57 (ASTM E57 3D file format)</a> (FDD 563), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000564.shtml?loclr=blogsig">
VRM</a> (FDD 564) and <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000556.shtml?loclr=blogsig">
ARML 2.0 (Augmented Reality Markup Language)</a> (FDD 556).<o:p></o:p></li><li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1">
<strong><span style="font-family:"Calibri",sans-serif">Web-enabled format entries</span></strong> include
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000577.shtml?loclr=blogsig">
WebP</a> (FDD 577), <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000578.shtml?loclr=blogsig">
VP8 Video Codec</a> (FDD 578) and <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000579.shtml?loclr=blogsig">
VP9 Video Codec</a> (FDD 579). <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000586.shtml?loclr=blogsig">
WACZ (Web Archive Collection Zipped)</a> (FDD 586) for <strong><span style="font-family:"Calibri",sans-serif">web archiving</span></strong> is also included on the
<a href="https://www.loc.gov/preservation/resources/rfs/webarchives.html?loclr=blogsig">
Recommended Formats Statement</a>.<o:p></o:p></li></ul>
<p class="MsoNormal"><a href="http://blogs.loc.gov/thesignal/files/2023/06/Figure1-PubLog.png"><span style="text-decoration:none"><img border="0" width="1024" height="534" style="width:10.6666in;height:5.5583in" id="_x0000_i1027" src="http://blogs.loc.gov/thesignal/files/2023/06/Figure1-PubLog-1024x534.png" alt="Screenshot of spreadsheet showing newest Format Description Documents (ordered from newest to oldest), with FDD numbers, names, URLs, and publication dates."></span></a><em><span style="font-family:"Calibri",sans-serif">Formats
publication log for new additions from January – June 2023. For the live version, see
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd_workplan.shtml?loclr=blogsig">
www.loc.gov/preservation/digital/formats/fdd/fdd_workplan.shtml</a>.</span></em> <o:p>
</o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif">RFS FDD prioritization</span></strong><o:p></o:p></p>
<p>Let’s keep the FDD update train rolling! In preparation for the release of the <a href="https://www.loc.gov/preservation/resources/rfs/?loclr=blogsig">
2023 Recommended Formats Statement (RFS)</a>, we’ve also been updating the FDDs called out in the RFS’s various content categories. You may remember from our
<a href="http://blogs.loc.gov/thesignal/2022/06/return-to-the-fascinating-world-of-file-formats/?loclr=blogsig">
Return to the Fascinating World of File Formats!</a> blog post last June that we developed a new process to pull the date of last update from our FDD XML to target those RFS FDDs. We’ve continued to build on this work and to standardize the process for updating
these FDDs. “What information are we updating?” is probably a question you’re asking right now. We’re sure by now you’ve checked out an FDD or two and noticed LOTS of links to external resources. That’s where we start with our updates: we make sure that links
are still active and resolve to the correct source. We’ve now also developed template language for the “LC Experience” and “LC Preference” sections in our FDDs to better clarify the Library’s holdings of a particular format and whether that format is listed
in the RFS. The clarity in the “LC Preference” field is important because we haven’t always been consistent in the past, and that has caused a few (or many) headaches when running our XML parsing script. We’re continuing to work on establishing consistency in that
field to save ourselves from future headaches.<o:p></o:p></p>
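<p>To make the idea concrete, here is a rough Python sketch of pulling a last-updated date out of an XML record and checking a preference field for RFS wording. It is not the Library’s actual script: the element names (<code>fdd</code>, <code>lastUpdated</code>, <code>lcPreference</code>), the helper functions, and the sample records are all hypothetical stand-ins, since the real FDD XML schema is not described in this post.</p>
<pre>
```python
import xml.etree.ElementTree as ET
from datetime import date


def make_sample(fdd_id: str, updated: str, preference: str) -> bytes:
    """Serialize a stand-in record. All element names here are hypothetical."""
    root = ET.Element("fdd", {"id": fdd_id})
    ET.SubElement(root, "lastUpdated").text = updated
    ET.SubElement(root, "lcPreference").text = preference
    return ET.tostring(root)


def last_updated(root: ET.Element) -> date:
    """Pull the date of last update out of one parsed record."""
    raw = root.findtext("lastUpdated")
    if raw is None:
        raise ValueError("record has no lastUpdated element")
    return date.fromisoformat(raw.strip())


def in_rfs(root: ET.Element) -> bool:
    """Rely on consistent template wording to detect RFS formats."""
    text = root.findtext("lcPreference") or ""
    return "recommended formats statement" in text.lower()


# Made-up sample records, not real FDD data.
documents = [
    make_sample("sample-a", "2023-05-01", "Listed in the Recommended Formats Statement."),
    make_sample("sample-b", "2014-02-10", "See the Recommended Formats Statement."),
    make_sample("sample-c", "2020-01-01", "No stated preference."),
]

rfs_dates = {}
for xml_bytes in documents:
    record = ET.fromstring(xml_bytes)
    if in_rfs(record):
        rfs_dates[record.get("id")] = last_updated(record)

print(rfs_dates)
```
</pre>
<p>The string match in <code>in_rfs</code> only works when every record uses the same wording, which is one way to see why inconsistent field text causes parsing headaches and why template language helps.</p>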
<p>Unlike last year, we actually had a priority one FDD from our prioritization list!
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000586.shtml?loclr=blogsig">
WACZ (Web Archive Collection Zipped)</a>, as mentioned above, is a brand-new FDD in the RFS. We still prioritized FDDs for formats listed as preferred or acceptable that had gone 5-10 years without a significant update, but we also reviewed newer FDDs. With
over 50 completed FDD updates, we continue to see the high value of this work, and it will remain a critical part of our yearly review.<o:p></o:p></p>
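<p>As a rough illustration of the age-based part of that prioritization, the Python sketch below flags records whose last significant update is at least five years old and lists the oldest first. The function names, the five-year default, and the sample dates are assumptions for this example, not the Library’s actual rules or data.</p>
<pre>
```python
from datetime import date


def needs_review(last_update: date, today: date, min_years: int = 5) -> bool:
    """True when the last significant update is at least min_years old."""
    try:
        cutoff = today.replace(year=today.year - min_years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        cutoff = today.replace(year=today.year - min_years, day=28)
    return cutoff >= last_update


def review_queue(updates: dict[str, date], today: date) -> list[str]:
    """Return the stale entries, oldest update first."""
    stale = [name for name, when in updates.items() if needs_review(when, today)]
    return sorted(stale, key=lambda name: updates[name])


# Made-up names and dates, for illustration only.
sample = {
    "format-a": date(2012, 3, 1),
    "format-b": date(2022, 11, 15),
    "format-c": date(2016, 7, 4),
}
print(review_queue(sample, date(2023, 6, 29)))
```
</pre>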
<p>The 2023-2024 version of the RFS will be published in the coming weeks, so stay tuned for a follow-up blog post highlighting all the changes.<o:p></o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif">Upcoming work</span></strong><o:p></o:p></p>
<p>We are excited to begin a new contract this month with Ashley Blewer, Abi Simkovic and Frances Harrell through Myriad Consulting. Over the next 12 months, this team will research and write close to 40 new FDDs. The
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd_workplan.shtml#2023-2024?loclr=blogsig">
2023-2024 work plan</a> is available and includes a few new areas of interest such as
<strong><span style="font-family:"Calibri",sans-serif">mobile device support, packaging, software and installation support, forensics and disc imaging</span></strong>, as well as filling in gaps in these existing content categories:
<a href="https://www.loc.gov/preservation/digital/formats/fdd/email_fdd.shtml?loclr=blogsig">
Email and Personal Information Manager (PIM) Formats</a>, <a href="https://www.loc.gov/preservation/digital/formats/fdd/design3D_fdd.shtml?loclr=blogsig">
Design and 3D</a>, <a href="https://www.loc.gov/preservation/digital/formats/fdd/dataset_fdd.shtml?loclr=blogsig">
Datasets and Databases</a>, <a href="https://www.loc.gov/preservation/digital/formats/fdd/still_fdd.shtml?loclr=blogsig">
Still Images</a> and <a href="https://www.loc.gov/preservation/digital/formats/fdd/text_fdd.shtml?loclr=blogsig">
Text</a>. We’re personally looking forward to the research work on <a href="https://adm.ebu.io/">
Audio Definition Model (ADM)</a>, <a href="https://formats.kaitai.io/gzip/#:~:text=Gzip%20is%20a%20popular%20and,by%20a%20chosen%20compression%20algorithm">
gzip</a>, <a href="https://github.com/dsnet/compress/blob/master/doc/bzip2-format.pdf">
bzip</a>, and <a href="https://support.apple.com/en-us/HT211965">Apple ProRAW</a>, just to name a few.<o:p></o:p></p>
<p>We’ve discussed how we prioritize which formats to work on in a <a href="http://blogs.loc.gov/thesignal/2021/12/fun-with-file-formats/?loclr=blogsig">
previous blog post</a>. More specifically for this upcoming group of FDDs, priority formats were identified via the Library’s
<a href="https://www.loc.gov/rr/perform/?loclr=blogsig">Music</a> and <a href="https://www.loc.gov/rr/mss/?loclr=blogsig">
Manuscript</a> divisions’ research efforts and holdings, inclusion in projects such as BitCurator (the Library of Congress is a member of the
<a href="https://bitcuratorconsortium.org/about/">BitCurator Consortium</a>) and wider community discussion.<o:p></o:p></p>
<p><strong><span style="font-family:"Calibri",sans-serif">Fan favorite formats</span></strong><o:p></o:p></p>
<p>But it’s not all about the new FDDs, so let’s look at the old favorites. We looked at the analytics from the last 12 months and found that
<a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000323.shtml?loclr=blogsig">
CSV</a> is our most popular FDD, followed closely by <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000508.shtml?loclr=blogsig">
Wavefront Material Template Library (MTL)</a>. We love a good CSV so this makes sense.<o:p></o:p></p>
<p class="MsoNormal"><a href="http://blogs.loc.gov/thesignal/files/2023/06/Figure2-CSV.png"><span style="text-decoration:none"><img border="0" width="1024" height="447" style="width:10.6666in;height:4.6583in" id="_x0000_i1026" src="http://blogs.loc.gov/thesignal/files/2023/06/Figure2-CSV-1024x447.png" alt="Screenshot of CSV Format Description Document"></span></a><em><span style="font-family:"Calibri",sans-serif">A
snippet of everyone’s favorite FDD, CSV Comma Separated Values (RFC 4180)! See <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000323.shtml?loclr=blogsig">
www.loc.gov/preservation/digital/formats/fdd/fdd000323.shtml</a> for the full version.</span></em>
<o:p></o:p></p>
<p>Then <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000445.shtml?loclr=blogsig">
DWG (AutoCAD Drawing) Format Family</a> and <a href="https://www.loc.gov/preservation/digital/formats/fdd/fdd000388.shtml?loclr=blogsig">
Email (Electronic Mail Format)</a> come in third and fourth, but with far fewer views than our top two (we’re talking thousands).<o:p></o:p></p>
<p>And more stats we can love: thousands of visitors came to our site over the past year from The Signal blog and blog posts just like this! Wikipedia is also a major referring site, which means Wikipedians are using our FDDs as source material. No matter
where you are coming from, whether you are linking from a different site or coming to us directly, we love our visitors just the same.<o:p></o:p></p>
<p>As always, comments and feedback are very welcome! Leave a comment here or send us a note at
<a href="mailto:formats@loc.gov">formats@loc.gov</a>.<o:p></o:p></p>
</td>
</tr>
</tbody>
</table>
<p><br>
<a href="https://blogs.loc.gov/thesignal/2023/06/filling-in-the-file-format-gaps/">View article...</a><o:p></o:p></p>
</div>
</body>
</html>