The Principles of Reading Order: A Research Program

Abstract: Much of education may be reduced to a sequencing problem: Given a set of books, in what order should they be read? This question admits of surprisingly rigorous treatment. We can identify several distinct types of dependency between books—topical, historical, procedural, motivational, and supplemental—and state principles governing each. These principles in turn suggest a practical research program: a collaborative, self-constructing system in which experts and well-read contributors record and confirm inter-book dependencies, producing an increasingly rich and useful ordering of books. Such a system would be of immediate practical value to curriculum designers, researchers, homeschoolers, and readers of all sorts, and of theoretical interest to anyone who thinks carefully about books, the knowledge they contain, and what it means to be well-educated.

Table of Contents

1. A general problem in education theory
2. Examples motivating the principles
3. The principles of reading order
4. A self-constructing system

1. A general problem in education theory

Here is an interesting formulation of a problem for education theory.

We may think of much of learning as consisting of reading books carefully. Obviously, if we know we want to get through a certain set of books, we may put them in an order; some orders will result in much better educational outcomes than others. To a certain extent, we may reduce the problem of getting an efficient education to the problem of deciding what books (and articles and perhaps other media such as film) should be studied in what order. Moreover, we may identify a number of fairly clear principles according to which the books may be ordered. Finally, we may make a programmatic claim: once you have decided that your education will involve reading just a few books or learning just a few subjects, the dependencies are so numerous and overlap so heavily that you can identify and confirm, often repeatedly, that many other books and topics must be part of any “well-rounded” education.

What unifies all the components of this system is the fact that they provide a practical research program for those interested in making recommendations of books and subjects of study. Such a system would be of extreme interest to, for example, homeschoolers, who face the sequencing problem in its purest form: they have no institutional curriculum handed to them, they are choosing from the full space of available books, and they are making ordering decisions constantly with limited expert guidance.

2. Examples motivating the principles

If Russell’s Problems of Philosophy and Wittgenstein’s Philosophical Investigations are about the same topic, and the former is more basic and the latter is more advanced, then it is better to read the former before the latter. This is very basic, but it appears to me to be a powerful ordering principle.

Similarly, if Philosophical Investigations and The Blue and Brown Books overlap greatly, it is best to read one but not both (unless one is a specialist).

Another ordering principle is historical dependency. The Constitution of the United States is essential to understanding the Federalist Papers, not just because the former is more basic and general, but because the latter were written as a defense of the former. Historical dependency is important in all branches of history, but perhaps especially in the history of ideas: some important works of philosophy, theology, law, etc., simply cannot be properly appreciated until one has read some other book to which they are responding.

Some books are prerequisites not (or not mainly) because of a topical dependence, but because of a procedural dependence. That is, the first teaches a skill needed to understand the second. Logic texts inculcate the ability to read densely argued proofs; mathematics texts teach the ability to solve problems found in books on science and engineering; language tutorials give the ability to read books in a target foreign language.

Sometimes one book adequately prepares one reader for another book, but does not adequately prepare a different reader. For example, a few readers may immediately dive into Aristotle’s Physics and Metaphysics after reading a general introduction to Greek philosophy. Many, however, might greatly benefit from reading a general introduction to Aristotle first. We may then define a supplementary prerequisite text as one that might be of use to some, but not all, readers.

Reading some books also greatly increases a reader’s interest in other books. Perhaps reading Orwell’s 1984, and some other easy works of fiction, would increase both the interest in and the amount one might learn from the Communist Manifesto, Mein Kampf, and The Road to Serfdom.

Needless to say, perhaps, any pair of books may stand in more than one dependency relation. For example, Euclid’s Elements is prior to Newton’s Principia topically (geometry is more basic than mechanics), procedurally (Euclid introduced and taught the world the method of exposition that Newton followed), and historically (Newton modeled his work in part on Euclid’s).

Each of the kinds of priority may be quantified as more or less. The Fellowship of the Ring has a topical priority to The Two Towers of a degree that is very high; the latter basically requires the former. This is not the case for, say, Locke’s Essay Concerning Human Understanding and Hume’s Enquiry Concerning Human Understanding. Reading the former of these helps to understand the latter, but it is not required to the same degree.

Many orderings may be discovered by interviewing specialists. Any economist can tell you that Hayek’s The Road to Serfdom is much easier than Mises’s Human Action. Cross-topic dependencies are less obvious, but here, too, experts can help, telling you, for example, that it would be useful to study introductory symbolic logic before studying discrete mathematics.

We can tentatively operationalize the importance of a book, then, by gauging how many other books (and topics) are thought to depend on it. Yet “importance” is also a feature that experts may gauge directly. It is an interesting empirical question whether a book’s dependence score is closely tracked by expert estimate of importance.
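One way to approach that empirical question, offered only as a sketch, is a rank correlation between the two measures. The code below computes Spearman's rank correlation by hand. The book labels and all of the numbers are invented placeholders, not data, and the choice of Spearman's coefficient is my assumption rather than something the proposal prescribes.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Pearson correlation of the ranks; undefined if either list is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Placeholder inputs: dependence counts and expert ratings for four books.
dependence_score = {"book_a": 12, "book_b": 7, "book_c": 3, "book_d": 1}
expert_rating = {"book_a": 5, "book_b": 3, "book_c": 4, "book_d": 1}

books = sorted(dependence_score)
rho = spearman([dependence_score[b] for b in books],
               [expert_rating[b] for b in books])
```

A value of rho near 1 would suggest that the derived score tracks expert judgment closely; a value near 0 would suggest the two measure different things.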

Orderings between texts can also be discovered by objective analysis of the topics found in the texts. We may read Tacitus’ Annals and observe that it makes mention of Caesar, the Senate, the Empire, and various Roman provinces; these are examples of prerequisite topics. One might, then, say that a general history of the world, or of ancient history or of Roman history, would introduce these topics well enough to make the Annals clearer; such works might then count as adequate prerequisite texts, but we ascertain this based on the topics, rather than by asking experts. Moreover, expert opinions of dependency surely can be confirmed or disconfirmed by reference to the topics in books.
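A rough sketch of this topic-based check follows. It scores a candidate prerequisite by the fraction of a text's presupposed topics that the candidate introduces. The `coverage` function, the candidate titles, and the candidates' topic lists are hypothetical illustrations; only the four topics attributed to the Annals come from the paragraph above, and how topics would actually be extracted from a text is left open.

```python
def coverage(presupposed, introduced):
    """Fraction of the topics a text presupposes that a candidate introduces."""
    if not presupposed:
        return 1.0  # nothing is presupposed, so nothing is missing
    return len(presupposed & introduced) / len(presupposed)

# Topics the Annals mentions, per the example above.
annals_presupposes = {"Caesar", "the Senate", "the Empire", "Roman provinces"}

# Hypothetical topic lists for two candidate prerequisite texts.
candidates = {
    "a general history of Rome": {
        "Caesar", "the Senate", "the Empire", "Roman provinces", "the Republic",
    },
    "an introductory chemistry textbook": {"atoms", "molecules"},
}

# Candidates ordered from most to least adequate as a prerequisite.
ranked = sorted(
    candidates,
    key=lambda title: coverage(annals_presupposes, candidates[title]),
    reverse=True,
)
```

The same score could be used to check an expert's claim of dependency: a claimed prerequisite with very low coverage would at least deserve a second look.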

Moreover, there is an order between broad topics (call them subjects) that the books cover. Linguistics, as a discipline or subject of study, depends on some study of grammar or foreign language. Similarly, within a discipline, we may often identify subdisciplines or topics and place them in an order. We can put subjects at these various levels of granularity in an order; and insofar as subjects may be learned by reading books about them, we can use the various orders of subjects to put books into order.

That said, I might want to make only books (and perhaps other media) nodes in the system. While a system might track disciplines and topics and their dependence relations, these strike me as being far more vague and ultimately less useful than the dependence relations between specific books (or, perhaps, between genres of books, when we are talking about such standard types as “introductory college chemistry textbooks”).

3. The principles of reading order

The principles of reading order can be stated fairly rigorously. Given books x, y, and z,

1. Priority and dependence. If x ought to be read before y, then we say that x is prior to y, and y is dependent on x. For any x and y, they might stand in various priority relations. Each of 2–6 states a different priority/dependence relation; each specifies a different type of reason for thinking that x ought to be read before y.
2. Topical priority. If x is more basic than y (or y is more advanced), then we say x is topically prior to y. We may define ‘more basic’ this way: If having read x will substantially improve the comprehension of y because topics in y are introduced in x in a way that assumes less knowledge of those topics, then x is more basic than y.
3. Historical priority. If y was written as a response to, elaboration of, defense of, or attack upon x, then x is prior to y historically.
4. Procedural priority. If x teaches a skill or method (e.g., reading proofs, solving equations, reading a foreign language) that is needed to understand y, then x is prior to y procedurally.
5. Motivational priority. If reading x is likely to increase the reader’s interest in or receptivity to y, then x is motivationally prior to y.
6. Supplemental priority. If most readers may proceed directly from x to z, but some readers would benefit from reading y between them, then y is supplementally prior to z (relative to x).
7. Unrelatedness. If reading x is unlikely to substantially improve the comprehension or appreciation of y, then x is unrelated to y.
8. Redundancy. If x and y substantially overlap in content and neither is clearly superior, then it suffices to read one but not both (unless one is a specialist in the relevant subject). We may call x and y redundant.
9. Importance. The importance of x may be operationalized as the number of other books that depend on it (by any of the above relations). This derived measure may be checked against expert estimates of importance.
10. Degrees of priority. We may propose various rubrics according to which experts may give a numerical rating to the priority of a pair of books. Even without such granular data, useful generalizations could be drawn from sheer numbers of (binary) assertions of a given dependence.
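The principles above might be modeled, as a minimal sketch, with one record per asserted priority relation. The names `Kind`, `Priority`, and `importance` are hypothetical, and two simplifications are my own assumptions: supplemental priority is stored as a plain pair (dropping the "relative to x" term), and importance counts each dependent book once no matter how many kinds of relation link it to the prior book.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    """The five priority relations, topical through supplemental."""
    TOPICAL = "topical"
    HISTORICAL = "historical"
    PROCEDURAL = "procedural"
    MOTIVATIONAL = "motivational"
    SUPPLEMENTAL = "supplemental"

@dataclass(frozen=True)
class Priority:
    """An assertion that `prior` ought to be read before `dependent`."""
    prior: str
    dependent: str
    kind: Kind

def importance(claims):
    """Count the distinct books that depend on each book, by any relation."""
    dependents = defaultdict(set)
    for claim in claims:
        dependents[claim.prior].add(claim.dependent)
    return {book: len(deps) for book, deps in dependents.items()}

# Relations taken from the examples in section 2.
claims = [
    Priority("Euclid, Elements", "Newton, Principia", Kind.TOPICAL),
    Priority("Euclid, Elements", "Newton, Principia", Kind.PROCEDURAL),
    Priority("Euclid, Elements", "Newton, Principia", Kind.HISTORICAL),
    Priority("The Fellowship of the Ring", "The Two Towers", Kind.TOPICAL),
]

scores = importance(claims)
```

Here the three Euclid-to-Newton relations contribute a single dependent, so the Elements scores 1, not 3. Counting relations rather than books would be an equally defensible reading of "number of other books that depend on it".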

We might also formulate principles of subject order, but this strikes me as being an ancillary problem, not being as much amenable to rigorous modeling as reading order.

4. A self-constructing system

It is not hard to imagine a new kind of collaborative knowledge resource that is built based on such principles. The basic idea is that we solicit data, consisting of answers to the above sorts of questions, from participants. This might well be interesting to participants, in the same way that it is fascinating to consider the various kinds of dependencies and priorities among books that one has read. The process of recording such data recapitulates much of the process of education. A system would have at least three parts: (1) information about participants; (2) a database of answers to questions; (3) search engines, summaries, bibliographies, etc., based on the data.

a. Participant info

We may also ask questions about the persons contributing: what are your degrees, and in what subjects? In what other areas have you done a substantial amount of reading—meaning, let us say, at least twenty books? We might include some process whereby we might use LLMs or other tools to confirm claims to expertise. Based on claims to expertise (or significant reading), an interview might pass immediately on to some core questions about the books themselves.

b. The main database

We can describe the main database as containing the answers to a series of questions about books and the relationships between books (and perhaps the relationships between books and topics, and the relationships between topics). For example:

In your area of specialization (or competence) A, what are five books that are most important to have read and understood?
For book x which you have recommended, can you propose one or more other books that would make x easier to understand and appreciate?
Is book x topically prior to book y? (Or, to spell out the meaning: Will having read x substantially improve the comprehension of y because topics in y are introduced in x in a way that assumes less knowledge of those topics?)
Can you think of a book (or works in other media) that would significantly increase a reader’s interest in or receptivity to a given book?

And so forth. Contributors might simply confirm claims others have made with checkboxes; when not confirmed, or only partly confirmed, they might supply something else. The claimed relationship might be quantified as well, with participants putting numbers (according to rubrics) on the degree of some kind of inter-book dependence.
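The confirmation step could be tallied along the following lines. This is a sketch under assumptions: the response fields (`contributor`, `prior`, `dependent`, `kind`, `confirmed`, `rating`) and the 1-to-5 rating rubric are hypothetical, and the sample responses are invented placeholders.

```python
from collections import defaultdict
from statistics import mean

def summarize(responses):
    """Group responses by claim; report confirmations, denials, mean rating."""
    grouped = defaultdict(list)
    for r in responses:
        grouped[(r["prior"], r["dependent"], r["kind"])].append(r)
    summary = {}
    for claim, rs in grouped.items():
        ratings = [r["rating"] for r in rs if r["rating"] is not None]
        summary[claim] = {
            "confirmed": sum(1 for r in rs if r["confirmed"]),
            "denied": sum(1 for r in rs if not r["confirmed"]),
            "mean_rating": mean(ratings) if ratings else None,
        }
    return summary

# Placeholder responses to the question "Is book x topically prior to book y?"
responses = [
    {"contributor": "c1", "prior": "x", "dependent": "y", "kind": "topical",
     "confirmed": True, "rating": 2},
    {"contributor": "c2", "prior": "x", "dependent": "y", "kind": "topical",
     "confirmed": True, "rating": 3},
    {"contributor": "c3", "prior": "x", "dependent": "y", "kind": "topical",
     "confirmed": False, "rating": None},
]

summary = summarize(responses)
```

Even the bare counts of confirmations and denials are informative without any ratings, which matches the point that binary assertions alone could support useful generalizations.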

While LLMs could be used to supply some data, questions of book priority are precisely the sort for which human judgment contains subtleties that are not easily found in LLMs. But LLMs are capable of providing plausible answers to questions such as “Is Book A easier than Book B on the same topic?” or “What topics does this text presuppose?” Any such LLM-generated seed data should, however, be flagged as such and treated as provisional until confirmed by a human contributor with relevant expertise.
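The flagging rule just described might look like this in outline. The `Claim` class, its fields, and the `has_relevant_expertise` argument are all hypothetical; how expertise would actually be established is the business of the participant-info component and is not modeled here.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """A dependence claim with a record of where it came from."""
    prior: str
    dependent: str
    kind: str
    source: str  # "human" or "llm"
    confirmed_by: set = field(default_factory=set)

    @property
    def provisional(self):
        """LLM-generated claims stay provisional until a human confirms them."""
        return self.source == "llm" and not self.confirmed_by

    def confirm(self, contributor, has_relevant_expertise):
        """Record a confirmation only from a contributor with relevant expertise."""
        if has_relevant_expertise:
            self.confirmed_by.add(contributor)

seed = Claim("x", "y", "topical", source="llm")
seed.confirm("c1", has_relevant_expertise=False)  # ignored: still provisional
still_provisional = seed.provisional
seed.confirm("c2", has_relevant_expertise=True)   # now confirmed by a human
```

Keeping the `source` field after confirmation means the front end can still report how much of the data originated with an LLM.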

c. The front end

Based on the data put into the system and the extent to which claimed facts are confirmed by other users, the system could be essentially self-constructing. As different books are added to the system, each with at least one other book claimed to have a dependence relation, an increasingly rich ordering of books would appear. Similarly, as contributors endorsed (or rated) other dependence relations, the data would become more valuable for a variety of educational, library, and research purposes.

When enough data was in the system, users could explore the framework in various ways:

by searching on books (as nodes in a dependency graph)
by searching on topics (as nodes in a graph, each containing the most foundational, linked, or rated books)
by exploring ways to extract suitable reading lists, syllabuses, and curricula
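Extracting a reading list from recorded dependencies is, in the simplest case, a topological sort: every book appears after the books it depends on. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The `reading_order` function is a hypothetical name, and the sketch assumes the dependencies contain no cycles, which real contributor data might not guarantee.

```python
from graphlib import TopologicalSorter

def reading_order(prerequisites):
    """Return the books in an order where each follows all its prerequisites.

    `prerequisites` maps each book to the set of books to be read before it.
    """
    return list(TopologicalSorter(prerequisites).static_order())

# Dependencies taken from the examples in section 2.
prerequisites = {
    "Wittgenstein, Philosophical Investigations": {
        "Russell, The Problems of Philosophy",
    },
    "The Two Towers": {"The Fellowship of the Ring"},
    "Newton, Principia": {"Euclid, Elements"},
}

order = reading_order(prerequisites)
```

Many valid orders usually exist, so a fuller system would need a tie-breaking rule, perhaps the importance score or the rated degree of priority. If contributors assert conflicting priorities that form a cycle, `static_order` raises `graphlib.CycleError`, and the conflict would have to be resolved first.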

We might, in fact, learn some new things about education and reading theory in this way. It is easy to imagine, as well, that different research programs would be based on the data or might seek to augment the data in various useful ways. If properly developed, this could be of considerable interest, I imagine, not just to education theorists and library scientists but to teachers and researchers in any bookish discipline. In particular, I suspect that once we start generating bibliographies in this way, we will never go back to alphabetical order.