Research Interests of Hugh C. Williams
My work in computational number theory extends from analyzing the complexity of number theoretic algorithms to actually implementing and testing such algorithms. I have worked and continue to work in applications of computing techniques to solving algebraic and elementary number theoretic problems. I am currently interested in developing parallel algorithms for solving certain number theoretic algorithms, particularly regulator and class number computation in real quadratic number fields.
Computing invariants in quadratic fields
One of the oldest and most difficult problems facing anyone attempting to perform arithmetic in number fields or function fields is the computation of the regulator and the ideal class number. These invariants are not only fundamental to the structure of the number fields being investigated, but the difficulty of computing them provides some idea of how hard it is to break cryptosystems based on the corresponding field. Thus, developing efficient techniques for computing these invariants is of great importance.
The best method for computing the class number and regulator that is currently known is Mike Jacobson’s implementation of the subexponential algorithm; however, this technique computes these invariants of a real quadratic field only conditionally. Jacobson and I were able to extend slightly his earlier methods to compute regulators and class numbers of real quadratic fields with discriminants of 80 digits. Later, we were able to compute the regulator and class number of a quadratic field with a discriminant of 101 digits. These are the largest such invariants ever computed, and although they are conditional, display the power of these new techniques.
These ideas were also applied to the problem of finding certain quadratic polynomials that will produce many prime values. In order to compute the density of primes (under Hardy and Littlewood's conjecture F) for these polynomials, it is necessary to evaluate a very slowly converging infinite product. This can be transformed into a more rapidly converging product if the class number and regulator of a certain quadratic number field can be evaluated. As a result of this work, we were able to display a polynomial that has the largest asymptotic density of prime values for any polynomial of this type currently known.
Unconditional evaluation of the regulator
Although the regulator determined by Jacobson’s algorithm is only conditionally correct, it is easy to verify rigorously that this algorithm computes an integral multiple of the regulator. Having made this determination, the next problem is to establish that the integral multiplier of the regulator in Jacobson's regulator is 1. There are two phases to doing this:
Establish that the regulator must exceed some predetermined bound.
Prove that for no integer less than a certain amount can we have the regulator being Jacobson's regulator divided by that integer.
It is interesting that both of these phases can be parallelized. Furthermore, the technique should with an appropriate representation of the ideals involved be completely integral. That is, we should not at any point have to work with any numbers but integers. This means that we do not have to deal with approximations to irrational numbers and the concomitant loss of rigour that often occurs as a result. Currently, Jacobson and I are collaborating on this problem and we already have some preliminary results. Recently, we were able to compute unconditionally a regulator for a quadratic field with a 55-digit discriminant. This was done with only 16 processors; thus, when fully parallelized the new algorithm may allow us to compute regulators for fields of perhaps 60 or 65 digit discriminants.
Other problems
involving real quadratic fields
Our ability
to compute the regulator for fields of large discriminant allows us to
develop some new algorithms for attacking certain types of Diophantine
equations. We used this information to solve some difficult Diophantine
equations that were left unsolved in a paper of Attila Pettö. We
have also developed improved algorithms for computing the regulator and
class number for many fields of more modest sized discriminants. These
were used to verify the very difficult Ankeny-Artin-Chowla conjecture
for all primes less than 200,000,000,000, and to test the Cohen-Lenstra
heuristics on the distribution of the odd part of the class number.
Indeed, with our new techniques Herman te Riele and I were able to
compute all the class numbers for all real quadratic fields of prime
discriminant less than 200,000,000,000. The results obtained agreed
with what the Cohen-Lenstra heuristics predicted. This is the most
extensive numerical verification of this conjecture ever performed and
was only possible because of our new techniques.
Development
of continued fractions for quadratic irrationalities
It is an
extremely interesting and important problem to detect infinite families
of positive integers D for which one can readily describe the
fundamental unit (and the set of principal reduced ideals) in the
quadratic field formed by adjoining the square root of D to the
rationals. This problem is also connected to that of finding real
quadratic fields of large class number. The problem is very old and
several examples of such families are given in the 12th chapter of
Dickson's
History of the Theory of Numbers.
I have been
very interested in the problem of exhibiting new families of such D
values and determining the fundamental unit in the corresponding field.
Alf van der Poorten and I have discussed a class of cases D=F(X) where
F is a quadratic polynomial with leading coefficient a square, for
which one obtains particularly small units essentially because the
period length of the continued fraction expansion of the square root of
D is, somewhat surprisingly, independent of the integer parameter X.
Indeed, if a condition of Schinzel is satisfied and the absolute term C
of F(X) is positive and not a perfect square, then the continued
fraction of the square root of D can usually be expressed very simply
in terms of the continued fraction of the square root of C for any
sufficiently large X.
There are
certain parametric families of values of D for which the period length
of the continued fraction expansion of the square root of D is a linear
function of one of the parameters used in the representation of
D. Such families have been the subject of much study by certain
authors, particularly Leon Bernstein who noticed that there is a
pattern of cycles within the continued fraction period. I have been
able to show that this phenomenon of cycles occurs in much greater
generality than Bernstein was able to detect. I am currently working on
extending these results to more general families of values of D.
Cryptography,
the design and implementation of secure communication systems, plays a
vital role in e-commerce, general Internet usage, large data base
security and defence system security, to mention only a few. Indeed,
cryptography is an essential component of any installation in which
secure communication is needed.
If a sender
and receiver of a message wish to ensure that no other unauthorized
party can read their transmission, they will make use of a particular cryptosystem.
A
conventional
cryptosystem
can be thought of as a large collection of
transformations, any one of which will render the original message
unintelligible, but in order for the receiver to read the message, he
must know which particular transformation was used by the sender.
The
information that identifies the transformation used by the sender is
called the key. To maintain security, it is essential that the
key be kept secret from any possible adversary. It is important to
recognize that there are several possible objectives of the
eavesdropper. One of these may simply be to read the transmissions, but
it is also possible that he may wish to forge messages or send false
messages or simply delete transmissions. We distinguish between a
privacy system and an authentication system. A privacy
system protects against the unauthorized extraction of information from
a given transmission, whereas an authentication system protects against
the unauthorized injection of information into a transmission.
In both
privacy and authentication systems it is vital that the key be known to
only the sender and receiver of the messages. This means at some point
the key must be communicated between the sender and receiver over a
different and very secure transmission channel than that used for the
transformed (encrypted) messages. As such communication channels are
expensive and often inconvenient to use, the objective of much of
modern cryptography has been to try to eliminate them altogether.
The security
of the
Digital Signature Standard (DSS), a standard for electronic
signatures endorsed by the United States National Institute for
Standards and Technology (NIST), is based on the presumed difficulty of
the Discrete Logarithm Problem (DLP) in mathematical structures called
finite fields. Also, a widely used procedure for exchanging secret
cryptographic keys (the Diffie-Hellman protocol) derives its security
from the belief in the difficulty of the DLP in finite fields. Intense
investigation of this DLP has revealed that it is easier to solve than
was previously thought. The technique (the Number Field Sieve)
that was used to determine this result does not seem to be applicable
to the DLP over other mathematical structures such as algebraic number
fields and function fields. In particular, the DLP in function fields
of low genus has proved so far to be very resistant to attack by all
known methods. Indeed, the elliptic curve system, a special
case of function field based schemes, is widely used in commercial
applications. The study of the DLP in algebraic number fields and,
particularly, in function fields is still very much in its infancy, and
more research must be done before it can be adopted by institutions
such as banks, securities firms and the insurance industry.
Although
function fields as a basis for cryptography have several advantages
over number fields, such as greater speed and cleaner computer
implementation, it should be pointed out that from an algorithmic and
cryptographic perspective, they are less well understood than number
fields. It is important to investigate number fields for possible
applications in cryptographic settings. Furthermore, it is always sound
cryptographic practice to have access to as many different systems as
possible. This ensures that the sender has a choice of possible
schemes, a very useful feature if one or more of them is compromised.
The simplest
number fields are the quadratic fields; performing arithmetic in these
structures is relatively efficient and simple compared to doing this in
other algebraic number fields. Nevertheless, they still possess many of
the complicating features that make them resistant to methods that have
proved to be successful in other structures. There are two types of
quadratic fields: imaginary and real. The cryptographic aspects of the
former have been the subject of intense scrutiny by Professor Buchmann
and his research team at the Technical University of Darmstadt. One of
my interests is the investigation of algorithms for implementing key
exchange protocols and for solving the DLP in real quadratic fields.
In 1994,
Scheidler, Buchmann and I described a technique for conducting a secure
cryptographic key exchange by making use of the infrastructure of the
principal ideal class of a real quadratic field. This system involves
the communication partners agreeing on a certain reduced principal
ideal of the number field from which the information needed to produce
the key can be extracted by a mutually agreed-upon process. The
partners never exchange this ideal over their communication channel,
but instead provide each other with enough information that each can
compute it. However, anyone who is eavesdropping on his or her
communication channel will not get enough information to determine what
this ideal is. The security of this scheme depends upon the number of
reduced principal ideals in the quadratic number field and the
difficulty of solving the DLP in the field.
The first of
these problems is easily handled by use of the Cohen-Lenstra heuristics
on the distribution of the odd part of the class number. This strongly
suggests that the class number is very likely to be small when the discriminant,
an important invariant of the field, is a prime or 4 times a prime.
Thus, if the discriminant is chosen in this way and is about six
hundred bits, we can be assured that there will very likely be at least
2250 reduced principal
ideals, making it impossible to break the system by exhaustive search.
The only
reasonably likely security problem is the possibility that the DLP can
be solved quickly in the field. This problem has been examined by a
number of authors, and Jacobson’s implementation of the sub exponential
algorithm is the current state of the art in solving this problem. It
must be emphasized that all of this work is based on the assumption of
the truth of a generalized Riemann hypothesis (GRH), an assumption
about the location of the zeros of a certain function of a complex
variable. Nevertheless, there are still a number of problems associated
with Jacobson's algorithm. Giesbrecht, Jacobson and Strorjohann have
very recently addressed some of these regarding the matrix operations,
but there remains a still difficult problem, essentially involving
precision, that is very tricky to deal with. This is a subject that I
am currently investigating.
It is
important to realize that even the most improved version of Jacobson's
algorithm is still very slow when the values of the discriminant become
large. For example, it took more than 87 days for his algorithm
(implemented on 16 550MHz Pentium III computers) to compute the regulator,
another important invariant, of a certain real quadratic
field with a discriminant of 101 decimal digits. As computation of the
regulator is essentially an instance of the DLP, this gives some
indication of how difficult the DLP is to solve, particularly when one
considers that the discriminant in this case was an easy one. It seems
that it is possible to construct discriminants where the solution of
the DLP appears to be very difficult indeed.
Improved
implementation
The current
version of our key exchange protocol is still somewhat slow. This is
not necessarily that serious a problem because key exchanges are done
infrequently. However, if we want to use the method as a signature
scheme, speed becomes a more important consideration. By developing a
new representation for the ideals being considered, we have been able
to lower the precision needed at the expense of increasing the
complexity of the second communication round.
This latter
has proved to be no real problem as it executes much more rapidly than
the first round protocol and seems not to be needed after all because
after extensive testing we have found that the communication partners
always get the same key ideal after the first round. After profiling
our implementation we found that by far the most time-consuming part of
the protocol is the ideal reduction phase--97% of the time is spent on
ideal reduction. Clearly, any further attempts to optimize the
algorithm must begin there. One possible improvement is to incorporate
the NUCOMP algorithm of Shanks, which lowers the size of the numbers
needed in the computations. Our preliminary work on this indicates that
this should effect a considerable saving.
Benchmarking
As there is
no rigorous mathematical proof of the security of our (or almost any
other) cryptosystem, the only way we can certify its security and
effectiveness is to test it extensively. It will be necessary to
conduct very large-scale numerical experiments in order to acquire the
data needed to determine accurately the security of our cryptographic
scheme.
This should
be possible because the data generated by our hardware configuration
should allow for accurate and reliable extrapolation. We have funding
from Alberta Ingenuity, the Alberta Science and Research Investments
Program, the Canada Foundation for Innovation and the University of
Calgary to set up a high-speed computing laboratory for testing and
benchmarking cryptographic systems. The lab should be up and running
early in 2003.
Design and Development of
Special-purpose Hardware Devices
A number
sieve is a special-purpose computing device that finds solutions to
large systems of single variable linear congruences with various
distinct coprime moduli. The mechanism detects these solutions simply
by searching through all integers up to a certain, preselected bound.
While this approach may sound naive, it is in fact possible through a
judicious exploitation of parallelism to make these devices execute at
a very rapid rate. For solving certain problems, the use of these
machines is the fastest method known. The construction of these devices
has a very long history, with the first having been built in 1912. They
were originally developed to assist in the integer factoring problem,
but as better methods of doing this have evolved, they are now being
used for other purposes.
For many
years I have participated in the design and construction of number
sieves. The most recent of these devices, the MSSU, completed in 1995,
is the fastest and most powerful in existence. For most problems this
machine can canvass the integers at the rate of 1 to 3 trillion numbers
per second. This
rate is over two orders of magnitude faster than that of any previous
sieve, and while it may not be as fast as
some supercomputers, it operates at only a very small fraction of the
cost. I am currently working with a graduate student, utilizing
FPGA (Field Programmable Gate Array) technology, to produce a much
faster and more versatile device.
For the past
several years I have been much involved in applying number sieves to
the solution of a number of problems arising in number theory and
cryptography. These range through the tabulation of a difficult
function of Erdös and Selfridge, an investigation and tabulation
of pseudopowers, calculation of certain quadratic polynomials which
have a high density of prime values, development of primality tests and
tabulation of pseudosquares, to the determination of primes for which a
certain character sum must be positive.
A. Granville,
R. Mollin and I were able to use the MSSU to assist in proving a rather
difficult conjecture of Mollin concerning an upper bound on the least
inert prime in a real quadratic number field. I should also point out
that we also used the MSSU to show for the first time that both 110 and
435 can be written as a sum of three cubes, a problem of much interest
to
diophantine
analysts.
We have also
been able to find a use for a number sieve in cryptography. In the real
quadratic field based Diffie-Hellman protocol, it is necessary to use a
field with a discriminant that will frustrate as much as possible any
attempt by a cryptanalyst to break the system. We discovered that such
a discriminant should be a quadratic nonresidue for as many of the
small primes as possible. By using the MSSU, we were very efficiently
able to find quite large discriminants that are quadratic nonresidues
for all primes less than 527. It should be possible to improve this by
using a new idea for sieve implementation due to Dan Bernstein.
History of Mathematics and Computation
I have an interest in many aspects of the history of computation in mathematics, particularly in number theory. For example, I have collected and published much information on number sieves and I maintain a keen interest in the history of the Pell equation. Also, my book, Édouard Lucas and Primality Testing, is a research monograph on the history of primality testing from classical antiquity to today. In particular, I show the tremendous influence on primality testing wielded by Édouard Lucas, the first individual to show that primality testing could be performed, at least on certain forms of numbers, without recourse to the laborious and tedious process of trial division.
I then go on to show how the modern methods of primality testing have evolved since Lucas' time. Indeed, half of the book is devoted to developments that took place since Lucas' death in 1891. These methods culminate in the two main techniques that are used today: the Jacobi sums algorithm and the elliptic curve algorithm. I also was able to show that Lucas, before elliptic curves were even considered by mathematicians, was very close indeed to a test for primes that was similar to today's elliptic curve methods. Unfortunately, his premature death precluded any further development of his early ideas.