The Giant List of Document File Types and Extensions
If you work with digital documents, you probably recognize a handful of document file types, also known as file extensions. A file extension is the set of letters that follows the final period in a file name. You’re probably familiar with DOCX, PDF, and a few others.
Most of the time, you’ll only notice a document’s file type when you encounter an error. You may need to make changes to a document sent to you and find that it is a read-only PDF. Sometimes opening or saving changes to other documents may result in a file format error.
Most document file types indicate their functions and limitations in their names. Knowing what these are can save you time and frustration when managing documents in your office. Here are the most commonly used document file types and what each one is used for.
- Document file types and extensions are suffixes on filenames that tell users what kind of information to expect in a file and what kind of applications they can use to open a file.
- Developers have used dozens of document file types in the history of electronic document storage, but you’re likely to encounter fewer than ten of these in current use.
- Some document file types are proprietary releases by developers such as Microsoft, some are open-source, and others are written in markup languages for rendering information in web browsers.
What are Document File Types and Extensions?
Document file types or extensions (also called filename extensions) are the suffixes, typically two to four letters long, appended to a filename after a period. Filename extensions are a form of metadata: they indicate how the data in a file is stored and which applications are meant to open it.
Developers originally used filename extensions to indicate catalog or index information to other users such as TXT for plain text, MUS for music, and GFX for graphics. Over time, developers began writing programs with multimedia functionality capable of handling a variety of data types. As programs became more complex and operating systems multiplied, developers shifted towards using filename extensions to indicate associated programs or functions.
With regard to documents, filename extensions typically tell users what kind of programs will be compatible with the file, whether it is written in a markup language for use in a web browser, and whether it may have read-only functionality.
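To make the idea concrete, here is a minimal sketch of how a program might read an extension and look up what it stands for. The `DOCUMENT_TYPES` table and the `describe` function are hypothetical names made up for this illustration, and the table covers only the types discussed in this article.

```python
from pathlib import PurePath

# Hypothetical lookup table for illustration; covers only the types in this article.
DOCUMENT_TYPES = {
    ".doc": "Word document (binary)",
    ".docx": "Word document (Office Open XML)",
    ".html": "Web page",
    ".odt": "OpenDocument text",
    ".pdf": "Portable Document Format",
    ".xls": "Excel spreadsheet (binary)",
    ".xlsx": "Excel spreadsheet (Office Open XML)",
    ".ppt": "PowerPoint presentation (binary)",
    ".pptx": "PowerPoint presentation (Office Open XML)",
    ".zip": "ZIP archive",
    ".txt": "Plain text",
}


def describe(filename: str) -> str:
    """Return a description based only on the filename extension."""
    # suffix is the text from the last period onward, including the period.
    extension = PurePath(filename).suffix.lower()
    return DOCUMENT_TYPES.get(extension, "Unknown type")


print(describe("Quarterly Report.DOCX"))
print(describe("archive.tar.gz"))
```

Note that the lookup relies on the name alone. Renaming a file changes what this function reports without changing the data inside the file.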
8 Document File Types and Extensions
Although developers have used more than two dozen document filename extensions in the past, more than half of these have fallen out of use. Currently, there are eight types you’re likely to encounter in the course of office document management.
1. DOC and DOCX
DOC – short for document – is a Microsoft proprietary filename extension for storing documents in Microsoft Word Binary File Format. Although DOC is native to Microsoft Word, other common word processors such as Apple Pages and AbiWord can create, read, and edit DOC files.
With the release of Word 2007, Microsoft replaced DOC with DOCX, part of the Office Open XML (OOXML) family, as the default Word file format. A DOCX file is an XML-based package that can store embedded charts, images, and other multimedia data alongside the document text. Because it stores data in a documented markup language (XML), DOCX has high interoperability with other applications.
Non-Microsoft products that contain support for DOCX include:
- macOS TextEdit
- Google Docs
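One way to see the XML inside a DOCX is to treat the file as what it is on disk: a ZIP package whose main text lives in a part named `word/document.xml`. The sketch below builds a stripped-down stand-in for such a package in memory and reads the text back out. The sample XML and the `build_package` and `extract_text` helpers are assumptions made for illustration; a real DOCX contains additional required parts, so Word would not open this simplified package.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Simplified stand-in for word/document.xml (illustrative content only).
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello from a DOCX part</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_package() -> bytes:
    """Build an in-memory ZIP holding only the main document part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def extract_text(package_bytes: bytes) -> list[str]:
    """Return the text of every w:t element in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return [node.text or "" for node in root.iter(f"{{{W_NS}}}t")]


print(extract_text(build_package()))
```

The same reading approach should work on a real file by passing its bytes to `extract_text`, since only standard ZIP and XML tools are involved.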
2. HTML
HTML stands for HyperText Markup Language. Markup languages control how the data stored in a file appears to end users. HTML is the standard markup language for web pages. Whenever you read text on the web, what you see (the spacing, the paragraph breaks, the location of images and buttons) is a rendering of HTML.
HTML files consist of elements marked by opening and closing tags. While you can open HTML files in word processors such as Microsoft Word and even save Word documents as HTML files, HTML is not an ideal format for editing and saving documents locally. In fact, even if you have HTML files saved on your computer, they will open in your web browser by default, which is the ideal place to view them.
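The opening and closing tags are easy to see if you walk through a small piece of HTML with a parser. The following sketch uses the `html.parser` module from Python's standard library; the `TagLogger` class and the sample markup are invented for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record opening tags, closing tags, and text in order of appearance."""

    def __init__(self) -> None:
        super().__init__()
        self.events: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("text", data.strip()))


SAMPLE = "<p>Invoices are due <strong>Friday</strong>.</p>"

logger = TagLogger()
logger.feed(SAMPLE)
logger.close()
for kind, value in logger.events:
    print(kind, value)
```

Each element opens and later closes, and elements nest: the `strong` element closes before the `p` element that contains it.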
3. ODT
ODT stands for OpenDocument Text. OpenDocument is a family of XML-based open file formats. OpenDocument file types include text (ODT), graphics (ODG), spreadsheets (ODS), and presentations (ODP).
ODT is the open-standard counterpart of DOCX. Open formats like OpenDocument have no proprietary owner. As such, users can expect long-term access to data stored in OpenDocument files regardless of future legal or application changes.
ODT files can be viewed and edited in most modern office applications, including Microsoft Word, WordPerfect, and OpenOffice.
4. PDF
PDF stands for Portable Document Format. The emphasis here is on portable. Once upon a time, if you wanted to pass a document along to someone else, they had to have the same software you used to create the file. As software options proliferated, this became more and more challenging.
PDF severs the tie between a document and the program that created it. A PDF is a print-ready document that will look the same to everyone, irrespective of what software was used to create the document.
PDFs describe both text and images using an imaging model derived from the PostScript page description language. Because a PDF carries its own description of every page, including text, images, and layout, PDFs are among the most software- and application-independent file types for documents.
The PDF format supports several useful digital asset management features:
- Encryption
- Password protection
- Digital signatures
- File attachments
- Video embedding
Support for these features – particularly encryption and password protection – makes PDFs the optimal file type for distributing read-only documents.
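As a rough illustration of how a file's format exists independently of both its name and the program that created it, the sketch below identifies a file from its first bytes. It relies on two common conventions: a PDF begins with `%PDF-`, and ZIP-based packages begin with the bytes `PK\x03\x04`. The `sniff` function and the sample data are hypothetical, and the PDF sample is only a header, not a valid document.

```python
import io
import zipfile


def sniff(data: bytes) -> str:
    """Guess a file's container format from its leading bytes."""
    if data.startswith(b"%PDF-"):
        return "PDF"
    if data.startswith(b"PK\x03\x04"):
        # DOCX, XLSX, PPTX, and ODT are ZIP packages, so they match here too.
        return "ZIP-based"
    return "unknown"


# Hypothetical sample data: a bare PDF header and a one-file ZIP archive.
pdf_like = b"%PDF-1.7\n"
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("note.txt", "hello")
zip_like = zip_buffer.getvalue()

print(sniff(pdf_like))
print(sniff(zip_like))
print(sniff(b"plain text"))
```

A check like this is only a first guess. It cannot tell a DOCX from an XLSX, and plain text files have no signature at all.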
5. XLS and XLSX
XLS and XLSX are Microsoft’s proprietary spreadsheet file types. XLSX is XML-based and is the current default file type in Microsoft Excel, having replaced XLS with the release of Office 2007, alongside the switch from DOC to DOCX.
Non-Microsoft products that contain support for XLSX include:
- Apple Numbers
- Quattro Pro
- Google Sheets
6. PPT and PPTX
PPT and PPTX are Microsoft’s proprietary presentation file types. PPTX is XML-based and is the current default file type in Microsoft PowerPoint, having replaced PPT with the release of Office 2007, alongside the switch from DOC to DOCX.
Non-Microsoft products that contain support for PPTX include:
- Apple Keynote
- Google Slides
7. ZIP
ZIP is an archive file format that supports data compression. ZIP became popular when the internet was still in its infancy and file transfers could take hours. Most files contain repeated patterns and other redundancy, so their data can often be encoded in fewer bytes, which saves storage space and speeds up transmission. A ZIP archive can also combine multiple files into a single compressed file while preserving the folder structure. As a result, ZIP files became the de facto way to share files. Even today, ZIP is the most common compressed archive format for sharing and archiving documents.
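Both ideas, compression and bundling several files while keeping their folders, can be sketched with Python's standard `zipfile` module. The file names and contents below are hypothetical, and the text is deliberately repetitive so that the effect of compression is easy to see; real documents will compress by different amounts.

```python
import io
import zipfile

# Hypothetical file names and contents, used only for illustration.
FILES = {
    "contracts/2023/lease.txt": "Lease terms. " * 200,
    "contracts/2024/renewal.txt": "Renewal terms. " * 200,
    "readme.txt": "Archive of contract documents.",
}


def make_archive(files: dict[str, str]) -> bytes:
    """Pack several files into one DEFLATE-compressed ZIP held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, text in files.items():
            # Forward slashes in the entry name preserve the folder structure.
            archive.writestr(path, text)
    return buffer.getvalue()


archive_bytes = make_archive(FILES)
original_size = sum(len(text.encode("utf-8")) for text in FILES.values())

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    for info in archive.infolist():
        print(info.filename, info.file_size, "->", info.compress_size)

print("total:", original_size, "bytes ->", len(archive_bytes), "bytes")
```

Very small files, such as the one-line readme here, may not shrink at all, because each entry in the archive carries some fixed overhead.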
8. TXT
TXT files are the simplest document file type. They contain only sequences of characters and cannot encode information such as fonts, colors, and other styling effects. Virtually all operating systems and word processing applications can open TXT files.