This is a mathematical analysis of the algorithm used by the National Residency Matching Program to assign prospective medical residents to hospitals. It was adapted from a lecture by Steven Rudich at Carnegie Mellon. The algorithm is described on the official web site, but this gives a slightly different characterization that is more convenient for analysis. It then proves some properties of the algorithm that may be reassuring to some students entering the match.

For simplicity, assume that each student ranks all of the hospitals and each hospital ranks all of the students. Each hospital has a certain number of positions they would like to fill. Assume the sum of these is at least as large as the total number of students. Given these assumptions, each student can be assigned to some hospital. Otherwise, some students may be unmatched, which just makes the analysis a bit more complicated, but its actually fairly simple to deal with, as I explain below.

You can think of the matching process as occurring over a series of days. It's all simulated on a computer, but in effect this is what happens: Each morning the students go to the highest hospital on their list that has not been eliminated on an earlier day. On the first day, each student will go to their top choice. Each hospital sorts the students who show up according to the hospital's preference.

If a hospital has 5 slots, then the top 5 students are told, "You can
come back tomorrow." This is the hospital's *candidate set* for
the day. Those students will definitely show up again the next day
because that hospital remains at the top of their list.

The other students are told, "You can never go here." They must cross the hospital off the top of their list. The next day they will go to the next-most preferred hospital. This continues until the day on which no student gets shafted. At that point, wherever the students are is their final hospital.

The first thing we can prove is that **the algorithm is guaranteed to
stop**. This is pretty clear. Each student begins with a list
containing a finite number of hospitals. Each day, at least one student
gets shafted, so the total number of entries on all the lists goes down
by at least one. The maximum number of days the algorithm could last is
the number of students times the number of hospitals.

The next thing to prove is that a hospital's candidate set can only
remain the same or improve as time goes on. That is to say, **the
worst student in the candidate set on one day will not be better than
the worst student in the set on a later day**. This is because any
student in the candidate set one day will definitely return on the next
day. The only reason they will ever be booted from the candidate set is
if a better student shows up. So the set can't get worse.

Next we need to introduce some more terminology. A *pairing* is an
assignment of a hospital to each student. A pairing is *unstable*
when student *X* is assigned hospital *A* but prefers
hospital *B* while hospital *B* prefers student *X* to
some other student, *Y*, that was given hospital *B*. A
pairing is *stable* when there is no such situation.

We can prove that **the matching algorithm produces a stable
pairing**. Let's assume that this is not true and that an unstable
pairing results, such that student *X* is assigned hospital
*A* but prefers hospital *B* and *B* prefers *X* but
was assigned *Y*. Since *X* prefers *A* to *B*,
*X* would have visited *A* before visiting *B*. Since
*X* didn't end up at *A*, *X* must have been booted from
*A* at some point. But somehow, *A* ended up with student
*Y* who is worse than student *X*. This violates the theorem
that *A*'s candidate set cannot get worse. Therefore, we have a
contradiction so the assumption must be false. Thus, the final pairing
is stable.

Incidentally, since the algorithm always produces a pairing and the
pairing is always stable, we can conclude that given any set of
preference lists, **there always exists at least one stable
pairing**. Now, there may be many stable pairings. For each student,
we can define the *possible hospitals* as the hospitals the student
is matched with in at least one of the stable pairings. A student's
*optimal hospital* is the one they prefer the most of their
possible hospitals. The *pessimal hospital* is the one they prefer
the least. Similarly, each hospital has a set of possible students and
an optimal and pessimal student.

A matching algorithm is said to be *student-optimal* if every
student is always matched to their optimal hospital. This is the best
of all possible stable matchings for every student. We will prove that
**this algorithm is student-optimal**.

If it does not produce a student-optimal pairing, then at some point a
student must have been rejected by his optimal hospital (since the
student works down his list by preference). Assume *X* is the
*first* such student and *A* is *X*'s preferred hospital.
Let *S* be the set of students that *A* chose over *X*
because *A* prefers all students in *S* to *X*. Since
*X* was the first student to be rejected by his optimal, no student
in *S* has been rejected by her optimal. Therefore, the optimal
for each student in *S* is either *A* or worse than *A*
(as far as that student is concerned). We conclude that **no student in
S can ever be matched with a hospital better than A**.

Now, because *A* is *X*'s optimal hospital, it must be in
*X*'s set of *possible hospitals*. Therefore, there must
exist at least one *stable* pairing that does match *X* and
*A*. But in that pairing, there must be at least one student from
*S* who is not paired with *A* (because *X* replaced
her). That student, *Y*, must be paired with a hospital worse than
*A* because of what we just concluded. Therefore, *Y* and
*A* form an unstable pair. *Y* prefers *A* to its
hospital and *A* prefers *Y* to *X*. Therefore, the
pairing is unstable, which is a contradiction.

Thus, the assumption that there can be a student who is rejected by his optimal hospital was false. As a result, the pairing is student-optimal. Incidentally, the original centralized matching system was started in 1950, but the algorithm did not always produce stable pairings. In 1952, they switched to a hospital-optimal version of the algorithm. In January 1998 the algorithm changed again, apparently to the current student-optimal form.

One can also prove that **the pairing produced by the algorithm is
hospital-pessimal**, meaning that a hospital is always assigned the
students from its set of possible students that it least wants. I won't
go into the proof in depth. But for those who want to play along at
home: start with a student-optimal pairing, assume that there exists a
stable pairing in which a hospital gets a worse student than they did in
the first pairing, we can show that the second pairing has an
instability, therefore the assumption was false.

To make the analysis easier, we made the assumptions that every student ranks every hospital and vice-versa and that there are enough spots for all the students. In reality, of course, this is not the case.

However, we can make reality fit the idealized model if we assume that
there exists one extra hospital. This is known as the
**You-Want-Fries-With-That Center For Cardiovascular Research**, or
YWFWTCCR. It goes on a students' preference list just below the last
hospital that they actually listed and above all hospitals they failed
to list. The nice thing is that YWFWTCCR has unlimited spots. So if
your top choices don't work out, you're still guaranteed a job. And
maybe even your very own uniform and visor!

- The guidelines on the official site are basically trustworthy.
- You should list schools in the actual order you want to go to them. You should not promote certain schools in your rankings because you think they are more or less likely to accept you than other schools. However, this only applies to the order of the schools on your list. If the decision is whether to fill your last slot with a very selective school or a safety school, then your chances of not being matched should be considered. But since they allow you to list an unlimited number of schools, that doesn't really apply.
- Adding extra schools to your list (assuming you would be willing to attend them) will not dilute or reduce your chances of getting into the higher-ranked schools. Adding an extra school simply reduces the chance that you aren't matched at all.
- Because the process is school-pessimal, program-directors might think that inverting their ranking of students could lead to a better chance of getting their true preferences. However, this is not the case. Assuming students produce reasonable rankings, reducing the ranking of a student will only increase the chance that a matching with that student is unstable.

Douglas Rohde Last modified: Mon Mar 15 17:53:29 EST 1999