%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentstyle[11pt]{article}
 
% DINA4-Format
%\oddsidemargin 6pt\evensidemargin 6pt\marginparwidth 48pt\marginparsep 10pt
%\topmargin -18pt\headheight 12pt\headsep 25pt\footheight 12pt\footskip 30pt
%\textheight 625pt\textwidth 431pt\columnsep 10pt\columnseprule 0pt

\oddsidemargin 6pt\evensidemargin 6pt\marginparwidth 48pt\marginparsep 10pt
\topmargin -18pt\headheight 12pt\headsep 25pt\footheight 12pt\footskip 30pt
\textheight 625pt\textwidth 448pt\columnsep 10pt\columnseprule 0pt

%\tolerance=2000
%\textwidth 6.55in
%\textheight 9.0in
%\oddsidemargin 0.001in
%\evensidemargin 0.001in
%\topmargin -0.4in
 

 
\sloppy

% Dirty trick to avoid italic in theorems
\catcode`@=11
\def\@begintheorem#1#2{\trivlist \item[\hskip \labelsep{\bf #1\ #2.}]}
\def\@opargbegintheorem#1#2#3{\trivlist
      \item[\hskip \labelsep{\bf #1\ #2.\ (#3)}]}
\catcode`@=12


\newenvironment{rem}{\vspace{3mm}\noindent{\bf Remark. }}{\vspace{3mm}}
\newenvironment{claim}{\vspace{3mm}\noindent{\bf Claim. }}{\vspace{3mm}}
\newtheorem{Theorem}{Theorem}[section]
\newtheorem{Lemma}[Theorem]{Lemma}
\newtheorem{Cor}[Theorem]{Corollary}
\newtheorem{Def}[Theorem]{Definition}
\newtheorem{Prop}[Theorem]{Proposition}
 
\newcommand{\theorem}[1]{\begin{Theorem} #1 \end{Theorem}}
 
\newcommand{\theoremcite}[2]{\begin{Theorem}{\rm\cite{#1}}~~#2
\end{Theorem}}
 
\newcommand{\proof}[1]{ \noindent {\it Proof.}
#1 \mbox{}\vspace{\baselineskip}\mbox{}\hspace*{\fill}$\Box$}
 
\newcommand{\proofof}[2]{ \noindent {\it Proof of #1.}
#2 \mbox{}\vspace{\baselineskip}\mbox{}\hfill$\Box$}
 
\newcommand{\lemma}[1]{\begin{Lemma}  #1 \end{Lemma}}
 
\newcommand{\cor}[1]{\begin{Cor}  #1 \end{Cor}}
\newcommand{\prop}[1]{\begin{Prop}  #1 \end{Prop}}
\newcommand{\defi}[1]{\begin{Def}  #1 \end{Def}}
 
\newcommand{\deficite}[2]{\begin{Def}{\rm\cite{#1}}~~#2
\end{Def}}
 
% Enumeration with alpha items
\newcommand{\alphaitems}{\renewcommand{\labelenumi}{\mbox{(}\/\alph{enumi}\/\mbox{)}}}
% Enumeration with roman items
\newcommand{\romanitems}{\renewcommand{\labelenumi}{\mbox{(}\/\roman{enumi}\/\mbox{)}}}


%***********
% From Jim Royer --KWR
% The following is a terrible hack, but it works---usually.
\newcommand{\parityP}{{\mathchoice{\raisebox{1pt}{$\displaystyle\oplus$}{\rm
   P}}{\raisebox{1pt}{$\textstyle\oplus$}{\rm
P}}{\raisebox{1pt}{$\scriptstyle\oplus$}{\rm
P}}{\raisebox{1pt}{$\scriptscriptstyle\oplus$}{\rm P}}}}
\newcommand{\Pe}{{\rm P}}
\newcommand{\NP}{{\rm NP}}
\newcommand{\FP}{{\rm FP}}
\newcommand{\PH}{{\rm PH}}
\newcommand{\CH}{{\rm CH}}
\newcommand{\Mod}{{\rm Mod}}
\newcommand{\mymod}{{\mbox {\rm ~mod~}}}
\newcommand{\ModkP}{\mbox{\rm Mod$_k\Pe$}}
\newcommand{\ModpP}{\mbox{\rm Mod$_p\Pe$}}
\newcommand{\PP}{{\rm PP}}
\newcommand{\ParP}{{\mbox{$\oplus$\Pe}}}
\newcommand{\pair}[2]{\langle #1,#2\rangle}
\newcommand{\triple}[3]{\langle #1,#2,#3\rangle}
\newcommand{\alphabar}{\overline \alpha}
\newcommand{\xibar}{\overline \xi}
\newcommand{\omegabar}{\overline \omega}
\newcommand{\polylog}{{\it polylog\/}}
\newcommand{\Ree}{\mbox {\cal Re}}
\newcommand{\Vector}[1]{\mbox {\rm \bf #1}}
\newcommand{\mathvector}[1]{\mbox {\boldmath $#1$}}
\newcommand{\nat}{\Vector{N}}
%\newcommand{\iff}{\, \Longleftrightarrow \,}
\newcommand{\implies}{\, \Longrightarrow \,}
\newcommand{\aarg}{\mbox {\rm arg}}
\newcommand{\N}{{\rm {\bf N}}}
\newcommand {\Z} {{\rm {\bf Z}}}

 
\newlength{\g}
\settowidth{\g}{=}
\newcommand{\Cg}{\sf C\hspace{-0.45\g}\raisebox{.15ex}{\footnotesize =}\hspace{-
   .15\g}}
\newcommand{\Cgex}{\footnotesize \sf C\hspace{-0.4\g}\raisebox{.18ex}{\tiny =}
                     \hspace{-.15\g}}
 
\newcommand     {\nule}         {\{0,1\}}
\newcommand     {\x}            {(|x|)}
 
\newcommand{\pmredu}{\mathrel{\hbox{$\leq_{\rm m}^{\rm p}$}}}
\newcommand{\AND}{\; \wedge \;}
 
   
\begin{document}

%\nopagenumbers


\title{Exponential Sums and
Circuits with a Single Threshold Gate and Mod-Gates}
\author{
Frederic Green  \\{\small Department of Mathematics and Computer Science}
                \\{\small Clark University}
                \\{\small Worcester, Massachusetts 01610}               
                \\{\small {\tt fgreen@black.clarku.edu}}
       }
\date{}
\maketitle

\begin{abstract}
  Consider circuits consisting of a threshold gate at the top, Mod$_m$ gates at
the next level (for a fixed $m$), and polylog fan-in AND gates at the lowest
level. It is a difficult and important open problem to obtain exponential lower
bounds for such circuits. Here we prove exponential lower bounds
for restricted versions of this model, in which each Mod$_m$-of-AND
subcircuit is a symmetric function of the inputs to that subcircuit. We show
that if $q$ is a prime not dividing $m$, the Mod$_q$ function requires
exponential size circuits of this type. This generalizes recent results and
techniques of Cai, Green and Thierauf [CGT] (which held only for $q=2$) and
Goldmann (which held only for depth two threshold over Mod$_m$ circuits). As a
further generalization of the [CGT] result, the symmetry condition on the
Mod$_m$ sub-circuits can be relaxed somewhat, still resulting in an
exponential lower bound. The basis of the proof is to reduce the problem to
estimating an exponential sum, which generalizes the notion of ``correlation"
studied by [CGT]. This identifies the type of exponential sum that will be
instrumental in proving the general case. Along the way we substantially
simplify previous proofs.  



\end{abstract}

\thispagestyle{empty}

\pagebreak

\setcounter{page}{1}

\section{Introduction}

   Proving exponential lower bounds against families of certain bounded depth
circuits presents complexity theory with some formidable problems. There was
much success in the early and middle 80's with the breakthrough results of
Furst, Saxe, and Sipser \cite{FSS}, Yao \cite{Yao1}, Cai \cite{Cai} and
H{\aa}stad \cite{Has} for AC$^{(0)}$ circuits, and those of Razborov
\cite{Raz} and Smolensky \cite{Sm} for AC$^{(0)}$ with Mod$_{p^k}$ gates for
a fixed prime power $p^k$. Since then, there has been little progress for
bounded-depth circuits containing Mod$_m$ gates for general composite $m$
(called ACC circuits \cite{Bar} ). On the other hand, there have been some
quite dramatic upper bounds for ACC (\cite{Yao2}, \cite{BT}, \cite{GKRST}).
Building on work of Toda \cite{Toda}, Yao \cite{Yao2} showed that ACC can be
simulated by depth 2 probabilistic circuits consisting of a symmetric gate
over polylog-size AND's. Beigel and Tarui \cite{BT} improved this by showing
that the simulation can be deterministic. Green, K{\"o}bler, Regan,
Schwentick and Tor{\'a}n \cite{GKRST} further showed that the top gate can be
one that computes a particular symmetric function (the so-called ``MidBit"
function that outputs the middle bit of the sum of the inputs).

  In Allender's original work \cite{Allender} proving the circuit
analogue of Toda's theorem, it was shown that AC$^{(0)}$ can be
simulated by quasi-polynomial size circuits with a threshold at the root, a
layer of parity gates below that, and a layer of polylog-size AND's attached
to the inputs. (In terms of polynomial-time classes, this is equivalent to 
PH $\subseteq$ PP$^{\parityP}$ relative to all oracles). Such
circuits appear to be weaker than the symmetric over AND's of Beigel and Tarui
\cite{BT} or the ``MidBit" over AND's of \cite{GKRST}. Nevertheless, in view
of the surprising power of the middle bit representation of ACC, it is natural
to ask, can the upper bounds on ACC be improved? In
particular, can ACC be simulated by depth three circuits of the type just
mentioned, of quasi-polynomial size, with a threshold on top, parity gates at
the next level, and small AND's at the bottom?

  The intuitive answer to this question is ``no." The purpose of this
note is to report on progress towards proving this conjecture. In fact, it
is natural to suspect that even the Mod$_3$ function cannot be so computed.
(This would imply an oracle $A$ such that Mod$_3$P$^A \not \subseteq
\PP^{\parityP^A}$, which is significant since no oracle is known separating
$\PP^{\parityP}$ from PSPACE.) There is some evidence to support this. Cai,
Green and Thierauf \cite{CGT} proved an exponential bound for the parity
function against circuits with a threshold over Mod$_m$'s (for odd $m$) over
small AND's, but with a severe restriction on the Mod-of-AND subcircuits:
Each Mod-of-AND subcircuit computes a {\it symmetric} function of the inputs.
As severe as this restriction is, it is reasonable to determine if our
circuit can take direct advantage of the fact that parity is a
symmetric function, and the lower bound says that it cannot. The key
contribution of \cite{CGT} was to show that the general problem (i.e., with
unrestricted polynomials) reduces to an important type of problem that
often arises in number theory: estimating an exponential sum.

  Unfortunately, it was not at all clear from \cite{CGT} how to obtain
lower bounds for any mod function other than parity. The quantity computed
in \cite{CGT}, the ``correlation," is really only appropriate for lower
bounds on the parity function. The main contribution of this paper is to
develop a technique for treating other mod functions, again via
exponential sums. Using this formulation, before generalizing \cite{CGT}, we
present a new proof of a result of Goldmann \cite{Go}: The Mod$_q$ function
requires exponential size on circuits consisting of a threshold over Mod$_m$
gates, where $q$ and $m$ are relatively prime. (This theorem was extended to
apply to thresholds of unbounded weight by Krause and Pudl{\'a}k \cite{KP}.)
The advantage of this approach is that the proof reduces to a routine
calculation. A very simple proof of the main lemma of \cite{CGT} is presented
here, which makes this paper almost entirely self-contained. Finally, the main
theorem shows that the symmetry constraint on the Mod-of-AND subcircuits can
be relaxed a little bit. One may allow the Mod-of-AND gates to be symmetric in
pairwise disjoint subsets of inputs, while still obtaining an exponential
lower bound. In a final section the outlook for the general case is
discussed in light of these new results.

   The introduction concludes with some remarks regarding notation and
conventions.  As is standard, any circuit
should be understood to be an element of an infinite family of circuits, one
for each number of inputs $n$. A Mod$_m$ gate over $n$ Boolean inputs
$x_1,...,x_n$ computes the Mod$_m$ function, which returns 1 iff $\sum_{i=1}^n
x_i \not \equiv 0~(m)$ (we use the slightly more compact notation $a \equiv
b~(m)$ for $a \equiv b~($mod$~m)$). A threshold gate over $n$ Boolean inputs
$x_1,...,x_n$ with weights $w_i \in \Z$ ($1 \le i \le n$) and threshold $t
\in \Z$, with $0 \le t \le n$, returns 1 iff $\sum_{i=1}^n w_i x_i \ge t$.
Our results apply to threshold gates with weights polynomially bounded in
$n$. As in \cite{GKRST}, if $G$ is a Boolean gate,
$G^+$ denotes a family of circuits with $G$ at the root and AND gates of
polylog fan-in at the input level. If $G$ is a Boolean gate and
${\cal C}$ is a circuit class, $G$-of-${\cal C}$ denotes the class of
circuits with ${\cal C}$-type circuits serving as inputs to $G$-type gates.
Thus the main concern of this paper is threshold-of-Mod$_m^+$ circuits. Note
that the size of the ${\cal C}$ subcircuits is to be regarded as a function of
the number of inputs to the global $G$-of-${\cal C}$ circuit. 


\section{Results}

  The proof of the main theorem is based on the ``$\epsilon$-discriminator"
lemma of Hajnal, Maass, Pudl{\'a}k, Szegedy and Tur{\'a}n \cite{HMPST}. It is
the basis for most of the lower bound proofs on depth three (or more)
circuits with a threshold on top (e.g., \cite{HMPST}, \cite{Gr91}, \cite{HG}).

\lemma{\label{discriminator}
  Let $T$ be a threshold circuit consisting of a threshold gate over
subcircuits $c_1,...,c_s$, each taking up to $n$ inputs. Thus, $T$ outputs 1
on input ${\bf x} \in \{0,1\}^n$ if and only if $\sum_{i=1}^s c_i({\bf x}) \ge
t$, for some integer $t$ which is fixed for the circuit $T$. Let $T$ compute
the Boolean function $f:\{0,1\}^n \rightarrow \{0,1\}$. Let $A \subseteq
f^{-1}(1)$ and $B \subseteq f^{-1}(0)$, and define the quantity,
$$\Delta_i = Pr(c_i({\bf x})=1 | {\bf x} \in A)
             - Pr(c_i({\bf x})=1 | {\bf x} \in B),$$
where $Pr$ denotes the probability over the uniform distribution over all
input settings. If $|\Delta_i| \le \epsilon$ for all $1 \le i \le s$,
then $s \ge \epsilon^{-1}$.
}

  The lemma is useful because if we can show that $|\Delta_i|$ is
exponentially small, then the circuit $T$ computing $f$ is exponentially large.
Most of the work is in computing a good upper bound for $|\Delta_i|$. But the
first crucial step is finding suitable sets $A$ and $B$. In the case considered
here, in which $f$ is the Mod$_q$ function, we choose sets which correspond
to the distribution used by Goldmann \cite{Go}, giving the most
unbiased possible situation. Let
\begin{eqnarray*}
  A & = & \{{\bf x} | \sum\limits_{i=1}^n x_i \equiv 1~(q)\}\\
  B & = & \{{\bf x} | \sum\limits_{i=1}^n x_i \equiv 0~(q)\}
\end{eqnarray*}
Thus, for the type of circuits considered here, namely
threshold-of-Mod$_m^+$, the quantity $\Delta$ has the form,
\begin{eqnarray}\label{delta-form}
  \Delta = Pr(c = 1 | \sum\limits_i x_i = 1~(q))
           - Pr(c=1 | \sum\limits_i x_i = 0~(q)),
\end{eqnarray}
where $c$ denotes a Mod$_m^+$ circuit.

For Mod$_m^+$ circuits, the evaluation of $\Delta$ reduces to
the evaluation of a certain exponential sum. The next lemma gives this
reduction.  Before stating it, we need some notation. For any natural
number $m$, $\zeta_m = e^{{2\pi i \over m}}$ denotes the complex primitive
$m^{th}$ root of unity. Recall that,
\[ \sum \limits_{a=0}^{m-1} \zeta_m^{ak}
     =   \left\{ \begin{array}{ll}
                     m   & \mbox{if $k \equiv 0 ~(m)$} \\
                     0   & \mbox{otherwise.}
                 \end{array}
         \right. \]
Let $t:\{0,1\}^n \rightarrow \Z$ be an
integer polynomial. The exponential sum $S(n,m,q;t)$ is defined as,
\begin{eqnarray}\label{S-form}
 S(n,m,q;t) = {1 \over m2^n} \sum_{{\bf x} \in \{0,1\}^n}
           \sum\limits_{a=0}^{m-1} \sum\limits_{b=0}^{q-1} (\zeta_q^{-b}-1)
                              \zeta_q^{b\sum_i x_i}
                              \zeta_m^{at({\bf x})}
\end{eqnarray}
Since $S(n,m,q;t)$ has so many arguments, when it is clear from the context
we will simply write $S$.

\lemma{\label{reduction}
  Let $n$ be any natural number, $q$ a prime which does not divide $m$,
and $c$ a Mod$_m$-of-AND circuit over $n$ inputs. Let $t$ denote the integer
polynomial such that $t({\bf x}) \not \equiv 0~(m)$ if and only if $c$
outputs 1. Define $\Delta$ as in equation (\ref{delta-form}) and $S(n,m,q;t)$
as in equation (\ref{S-form}). Then,
$$ |\Delta| \le |S| + 2^{-\Omega(n)}.$$
 }
\proof{
 To put $\Delta$ in the correct form, first consider the individual terms.
From the definition of conditional probability, for any integer $k$,
$$ Pr(c=1|\sum_i x_i \equiv k~(q)) = 
    {Pr(c=1 \AND \sum_i x_i \equiv k~(q)) \over
     Pr(\sum_i x_i \equiv k~(q))}.$$
 Now $Pr(\sum_i x_i
\equiv 1~(q))$ and $Pr(\sum_i x_i \equiv 0~(q))$ are very close; in fact, it is
well known \cite{Go} that they are exponentially close to ${1 \over q}$:
$${1 \over q} - 2^{-\alpha n}  \le
  Pr(\sum_i x_i \equiv k~(q)) \le {1 \over q} + 2^{-\alpha n},$$
for some
constant $\alpha$ depending on $k$. (This fact can be derived from
the techniques of this proof, in particular, those which give inequality
(\ref{little-zeta-sums}) below.) It follows that,
\begin{eqnarray*}
 |\Delta| \le q \left|
                  Pr(c=1 \AND \sum_i x_i \equiv 1~(q))-
                        Pr(c=1 \AND \sum_i x_i \equiv 0~(q))\right|
         ~~+ 2^{-\Omega(n)}.
\end{eqnarray*}
For any function $f:\{0,1\}^n \rightarrow \{${\it true, false}$\}$, write
$[f({\bf x})] =1$ if $f({\bf x})$ is {\it true} and $[f({\bf x})]=0$ if
$f({\bf x})$ is {\it false}. (Henceforth in this proof, square brackets will
be used only this context.) Thus, for example, $$Pr(c=1 \AND \sum_i x_i \equiv
1~(q)) = 2^{-n} \sum_{{\bf x} \in \{0,1\}^n} [c=1 \AND \sum_i x_i \equiv
1~(q)].$$ Then we may re-write the bound on $|\Delta|$ as,
\begin{eqnarray*}
|\Delta| & \le & q2^{-n} \left|\sum\limits_{{\bf x} \in \{0,1\}^n}
      \left( [c=1 \AND \sum_i x_i \equiv 1~(q)]
   - [c =1 \AND \sum_i x_i \equiv 0~(q)] \right)\right|~~+ 2^{-\Omega(n)}\\
   & = & 
  q2^{-n} \left|\sum\limits_{{\bf x} \in \{0,1\}^n}
      \left( [c=1 ]\cdot[ \sum_i x_i \equiv 1~(q)]
   - [c =1 ]\cdot[ \sum_i x_i \equiv 0~(q)] \right)\right|~~+ 2^{-\Omega(n)}
\end{eqnarray*}
By hypothesis, there is a polynomial $t$
in $x_1,...,x_n$ with integer coefficients such that $c({\bf x})=1$
iff $t({\bf x}) \not \equiv 0~(m)$. Observe, then, that
$$ [c({\bf x})=1] = 1 -
  {1 \over m} \sum \limits_{a=0}^{m-1} 
               \zeta_m^{at({\bf x})}.$$
 Similarly,
$$ [\sum_i x_i \equiv k~(q)] = 
       {1 \over q}\sum\limits_{b=0}^{q-1} 
       \zeta_q^{b (\sum_i x_i - k)},$$
for any $k$. Using these identities,\\

\noindent$ |\Delta|  \le  2^{-\Omega(n)} +$
\begin{eqnarray}\label{big-delta-expression}
q2^{-n} \left|\sum\limits_{{\bf x} \in \{0,1\}^n}
       \left( (1-{1 \over m} \sum_a \zeta_m^{at({\bf x})})
              ({1 \over q} \sum_b
                   \zeta_q^{b(\sum_i x_i-1)})
                - 
              (1-{1 \over m}\sum_a \zeta_m^{at({\bf x})})
              ({1 \over q}\sum_b \zeta_q^{b\sum_i x_i})\right)\right|.
\end{eqnarray}
Expand the expression in the summand, and note the terms that appear. We now
claim that,
\begin{eqnarray}\label{little-zeta-sums}
2^{-n}\left|\sum\limits_{{\bf x} \in \{0,1\}^n}{1 \over q}
             \sum\limits_{b=0}^{q-1} (\zeta_q^{b\sum_i x_i}
            - \zeta_q^{b(\sum_i x_i - 1)})\right|
       \le 2^{-\Omega(n)}.
\end{eqnarray}
To see this, perform the sum over the $x_i$'s to obtain,
\begin{eqnarray*}
  \sum \limits_{{\bf x} \in \{0,1\}^n} \zeta_q^{b\sum_ix_i}
        & = & (1 + \zeta_q^b)^n,\\
   \sum \limits_{{\bf x} \in \{0,1\}^n} \zeta_q^{b(\sum_ix_i-1)}
        & = & \zeta_q^{-b}(1 + \zeta_q^b)^n
\end{eqnarray*}
and hence,
$$2^{-n}\left(\sum\limits_{{\bf x} \in \{0,1\}^n}{1 \over q}
              \sum\limits_{b=0}^{q-1} (\zeta_q^{b\sum_i x_i}
            - \zeta_q^{b(\sum_i x_i - 1)})\right)
    = {1 \over q2^n}\sum\limits_{b=0}^{q-1}(1-\zeta_q^{-b})(1+\zeta_q^b)^n.$$
Thus, by the triangle inequality,
$$2^{-n}\left|\sum\limits_{{\bf x} \in
\{0,1\}^n}{1 \over q}
              \sum\limits_{b=0}^{q-1} (\zeta_q^{b\sum_i x_i}
            - \zeta_q^{b(\sum_i x_i - 1)})\right|
    \le {1 \over q2^n}\sum\limits_{b=0}^{q-1}|1-\zeta_q^{-b}|\cdot|1+\zeta_q^b|^n.$$
Note that the $b=0$ term does not contribute. Thus the maximum value that
$|1+\zeta_q^b|$ can take on is $|1+\zeta_q| = 2\cos(\pi / q)$. Including the
factor of $2^{-n}$ and noting that $\cos(\pi /q)$ is a constant less than 1,
inequality (\ref{little-zeta-sums}) follows.

  After simplifying inequality (\ref{big-delta-expression}), applying the
triangle inequality, and taking into account equation
(\ref{little-zeta-sums}), the lemma is proved.


}%end proof lemma "reduction"


  Lemmas \ref{discriminator} and \ref{reduction} show that obtaining an
exponential lower bound on threshold-of-Mod$_m^+$ circuits reduces to finding
an exponentially small upper bound on $|S(n,m,q;t)|$ where $t$ is a low degree
polynomial. Now $S$ is of the form,
$$ S(n,m,q;t) = {1 \over m} \sum\limits_{a=0}^{m-1}\sum\limits_{b=0}^{q-1}
                             (\zeta_q^{-b} -1) \sigma(n,m,q,a,b;t),$$
where we define,
\begin{eqnarray}\label{sigmadef}
  \sigma(n,m,q,a,b;t) = {1\over 2^n} \sum\limits_{{\bf x} \in \{0,1\}^n}
                         \zeta_q^{b \sum_i x_i} \zeta_m^{at({\bf x})}.
\end{eqnarray}
(Again, we often omit the arguments of $\sigma(n,m,q,a,b;t)$, and simply write
$\sigma$.) Thus the circuit lower bound reduces to finding exponentially small
upper bounds on $|\sigma(n,m,q,a,b;t)|$. Note that we may assume that $b \not
\equiv 0~~(q)$, since for $b \equiv 0~~(q)$ the terms in the sum for $S$ vanish.

  Before treating the more general case, using the methods of this paper we
prove the result of Goldmann \cite{Go}, that the Mod$_q$ function requires
exponential-size threshold-of-Mod$_m$ circuits. Although
it is much simpler, the proof contains some of the essential ingredients of
the proof of the main theorem. The key is to show that $\sigma$ factorizes
into $n$ factors of norm less than 1.

\theorem{\label{level-2}
  Let $q$ be prime and $m$ be a number such that $q$ does not divide $m$. Let
{\it C} be a threshold-of-Mod$_m$ circuit which computes the Mod$_q$ function.
Then
 the number of gates in $C$ is
$2^{\Omega(n)}$.
 }
\proof{
 Define $S(n,m,q;t)$ and $\sigma(n,m,q,a,b;t)$ as above. 
In this case, the
polynomial $t$ corresponding to a Mod$_m$ subcircuit is {\it linear}, so we
may write $t(x_1,...,x_n) = \sum_{i=1}^n k_i x_i$ for integers $k_i, 1 \le i
\le n$. By Lemmas \ref{discriminator} and \ref{reduction}, it suffices to
prove an exponentially small upper bound on $|\sigma|$ for $b
\not \equiv 0~(q)$ (since $b=0$ does not contribute to $S$). Then, we
can explicitly perform the sum over the $x_i$'s,
\begin{eqnarray*}
  \sigma & = & {1 \over 2^n} \sum\limits_{{\bf x} \in \{0,1\}^n}
               \zeta_q^{b \sum_i x_i} \zeta_m^{a\sum_{i=1}^n k_i x_i}\\
         & = & {1 \over 2^n} \prod\limits_{i=1}^n
                  (1 + \zeta_q^b \zeta_m^{ak_i})\\
         & = &  {1 \over 2^n} \prod\limits_{i=1}^n
                  (1 + \zeta_{qm}^{bm + aqk_i}).
\end{eqnarray*}
If $bm + aqk_i \equiv 0~(qm)$, it follows that $b \equiv 0~(q)$, contrary to
what we assumed. Hence $bm + aqk_i \not \equiv 0~(qm)$, and hence, 
$$ \zeta_{qm}^{bm + aqk_i} \not = 1.$$
Thus the greatest possible value for $|1+\zeta_{qm}^{bm + aqk_i}|$ is
$|1+\zeta_{qm}| = 2\cos{\pi \over qm}$, which implies,
 $$ |\sigma| \le
\left(\cos{\pi \over qm}\right)^n.$$ Since $\cos{\pi \over qm} < 1$ is a
constant, the result follows.
 }
 

  It appears to be a difficult problem to prove the required upper bound
on $|\sigma|$ for polynomials $t$ of degree more than 1. The following
gives a general estimate which doesn't depend on any
assumptions about $t$. Although this estimate is weak, we will make use
of it. Subsequently, stronger bounds are obtained when we place constraints on
$t$.

\lemma{\label{basic-sigma-bd}
  Let $q$ be a prime dividing neither $m$ nor $b$. Then,
$$ |\sigma(n,m,q,a,b;t)| \le \cos\left({\pi \over qm}\right).$$
}
\proof{The proof is similar to that of Theorem~\ref{level-2}, the difference
being that we only perform the sum over $x_1$. Details are given in the
Appendix.}

  
  Suppose $t:\{0,1\}^n \rightarrow \Z$ is a symmetric polynomial. Since it is
then a sum of elementary symmetric polynomials, it is a function only of the
sum of the inputs $\sum_i x_i$. Thus $t$ can be
represented by a function $r : \{0,...,n\} \rightarrow \Z$, where $r(k)$ is
the value of $t$ when $\sum_i x_i = k$. In this case we say $t$ is {\it
symmetric via $r$}. It is known (see, e.g., \cite{BBR}) that $r(k)$ is
periodic in $\Z/m\Z$. The period $D$ of $r$ mod $m$ is defined to be the
minimum $D > 0$ such that for all $k$, $r(k+D) \equiv r(k)~(m)$. $D$ is
bounded by a polynomial in the degree of $t$, and can be easily expressed in
terms of the prime factors of $m$ \cite{CGT}.

  We can obtain much stronger upper bounds than Lemma~\ref{basic-sigma-bd} for
$|\sigma(n,m,q,a,b;t)|$ when $t$ is symmetric. In order to obtain these
bounds, we will need the following lemma, originally obtained in \cite{CGT}.
Here we also present a much-simplified proof.


\lemma{\label{binom-sum}
For any natural numbers $D, k, n$,
  $$\sum \limits_{j \equiv k~~(D)}^n {n \choose j} = {1 \over D}
           \sum \limits_{\ell = 0}^{D-1}
\zeta_D^{k\ell}(1+\zeta_D^{-\ell})^n.$$ }
\proof{Using the fact that,
   \[ {1 \over D} \sum \limits_{\ell=0}^{D-1} \zeta_D^{\ell(k-j)}
     =   \left\{ \begin{array}{ll}
                     1   & \mbox{if $j \equiv k~~(D)$} \\
                     0   & \mbox{otherwise.}
                 \end{array}
         \right. \]
it follows that,
\begin{eqnarray*}
  \sum \limits_{j \equiv k~~(D)}^n {n \choose j} & = &
         \sum \limits_{j=0}^n {1 \over D} \sum \limits_{\ell=0}^{D-1}
                    \zeta_D^{\ell(k-j)} {n \choose j}\\
   & = & {1 \over D} \sum \limits_{\ell = 0}^{D-1} \zeta_D^{k\ell}
            \sum \limits_{j=0}^n \zeta_D^{-j\ell} {n \choose j}\\
   & = & {1 \over D}
           \sum \limits_{\ell = 0}^{D-1}
             \zeta_D^{k\ell}(1+\zeta_D^{-\ell})^n,
\end{eqnarray*}
where the last equality follows from the binomial theorem.
}%% binom-sum

 The following is the main technical lemma, which gives an upper
bound for $|\sigma|$ in the event that $t$ is symmetric. Note that this
is an exponentially small upper bound as long as the period $D$ grows slowly
enough. Fortunately, this is the case in the main theorem. It would be nice
to obtain a stronger upper bound (e.g., $2^{-\Omega(n)}$ even when $D$ is
not a constant), but it is not clear how to eliminate the factors of
$1/(qD)$ in the exponent.

\lemma{\label{better-sigma-bd}
  Let $n$ be any natural number, $q$ a prime dividing neither $m$ nor $b$,
and let $t:\{0,1\}^n \rightarrow \Z$ be symmetric via $r$. Let the function
$r(k)$ have period $D$ mod $m$. Then,
$$ |\sigma(n,m,q,a,b;t)| \le
     \left(\cos\left({\pi \over qm}\right)\right)^{{n \over (qD)^3}}.$$
}
\proof{
When $n < (qD)^3$, the conclusion is clear from
Lemma~\ref{basic-sigma-bd}, since in that case, noting that 
$\cos\left({\pi \over qm}\right) < 1$, 
  $$|\sigma| \le \cos\left({\pi \over qm}\right) \le 
     \left(\cos\left({\pi \over qm}\right)\right)^{{n \over (qD)^3}}.$$

  Therefore, assume for the remainder of the proof that $n \ge (qD)^3$.

  Using the definition of $\sigma$, the fact that the function
$\zeta_q^{bk}\zeta_m^{ar(k)}$ has period $qD$, and
Lemma~\ref{binom-sum},
\begin{eqnarray*}
  \sigma  & = & {1 \over 2^n}
                \sum\limits_{j=0}^n
                \zeta_q^{bj}\zeta_m^{ar(j)} {n \choose j}\\
          & = & {1 \over 2^n}
                \sum\limits_{k=0}^{qD-1} \zeta_q^{bk}\zeta_m^{ar(k)}
                \sum\limits_{j\equiv k~(qD)}^n {n \choose j} \\
          & = & {1 \over 2^n}
                \sum\limits_{k=0}^{qD-1} \zeta_q^{bk}\zeta_m^{ar(k)}
                {1 \over qD} \sum\limits_{\ell=0}^{qD-1}
                   \zeta_{qD}^{k\ell}(1+\zeta_{qD}^{-\ell})^n \\
          & = & {1 \over 2^n} {1 \over qD}
                \sum\limits_{\ell=0}^{qD-1} (1+\zeta_{qD}^{-\ell})^n
                \sum\limits_{k=0}^{qD-1}\zeta_q^{bk}\zeta_m^{ar(k)}
                                        \zeta_{qD}^{k\ell}\\
          & = & {1 \over 2^n} {1 \over qD}
                \sum\limits_{\ell=0}^{qD-1} (1+\zeta_{qD}^{-\ell})^n
                \sigma^{(\ell)},
\end{eqnarray*}
where $\sigma^{(\ell)}$ is defined as,
$$ \sigma^{(\ell)} \equiv_{def} 
\sum\limits_{k=0}^{qD-1}\zeta_q^{bk}\zeta_m^{ar(k)}
                                        \zeta_{qD}^{k\ell}.$$
  We determine those values of $\ell$ such that $\sigma^{(\ell)} = 0$
as follows. Using the period of $r$ once again, and re-defining variables of
summation,
\begin{eqnarray*}
  \sigma^{(\ell)} & = & 
            \sum\limits_{k=0}^{qD-1}
            \zeta_q^{-bD}\zeta_q^{b(k+D)}\zeta_m^{ar(k+D)}
            \zeta_{qD}^{-\ell D}\zeta_{qD}^{\ell(k+D)}\\
                  & = & 
            \zeta_q^{-bD}\zeta_{qD}^{-\ell D}
            \sum\limits_{k=0}^{qD-1}
              \zeta_q^{b(k+D)}\zeta_m^{ar(k+D)}\zeta_{qD}^{\ell(k+D)}\\
                  & = & 
            \zeta_q^{-bD}\zeta_{qD}^{-\ell D} \sigma^{(\ell)}.
\end{eqnarray*}
Thus,
$$(1 - \zeta_q^{-bD}\zeta_{qD}^{-\ell D}) \sigma^{(\ell)} = 0,$$
or,
$$(1 - \zeta_q^{-(\ell + bD)})\sigma^{(\ell)} = 0.$$
Now $\zeta_q^{-(\ell + bD)} = 1$ iff $\ell + bD \equiv 0~(q)$, which is true
for exactly $1/q$ of the $qD$ values of $\ell$. For any such value of $\ell$,
trivially, $|\sigma^{(\ell)}| \le qD$. For any $\ell$ such that
$\zeta_q^{-(\ell + bD)} \not = 1$, the above equation implies
$\sigma^{(\ell)} = 0$. In particular, note that $\sigma^{(0)} = 0$, so that
in the sum for $\sigma$, the $\ell = 0$ term does not contribute. This is
significant, as in \cite{CGT}, because the $\ell \not = 0$
terms are exponentially small (in $n$) compared with the $\ell = 0$ term.

  Using the triangle inequality, and making use of the observations above and
our expression for $\sigma$ up to this point, we obtain, 
\begin{eqnarray*}
  |\sigma| & \le & {1 \over 2^n qD}
                    \sum\limits_{\ell=0}^{qD-1}
                       |1+\zeta_{qD}^{-\ell}|^n \cdot |\sigma^{(\ell)}| \\
           & \le & {1 \over 2^n qD} \cdot |1+\zeta_{qD}^{-1}|^n
                   \cdot {1 \over q} \cdot qD \cdot qD \\
           & = & {|1 + \zeta_{qD}^{-1}|^n \over 2^n} \cdot D \\
           & = & \left (\cos\left({\pi \over qD}\right) \right)^n \cdot D\\
           & = & \left(D^{1 / n} \cos{\pi \over qD} \right)^n.
\end{eqnarray*}
From the well-known inequalities $e^x \ge 1+x$ and $1-{1\over 2}x^2
\ge \cos(x)$, we conclude that
     $\cos({\pi \over qD}) \le e^{-{1 \over 2} ({\pi \over qD})^2}$.
Since $n \ge (qD)^3$, 
$$D^{1/n} \cos\left({\pi \over qD}\right) \le D^{1 / (qD)^3}
                    e^{-{1 \over 2} ({\pi \over qD})^2} =
    e^{{\ln(D) \over (qD)^3} - {1 \over 2} ({\pi \over qD})^2}.$$
But it is easy to show that,
$$ {\ln(D) \over (qD)^3} - {1 \over 2} ({\pi \over qD})^2
  \le -{1 \over (qD)^2},$$
and thus,
$$ D^{1/n}\cos\left({\pi \over qD}\right) \le e^{-{1 \over (qD)^2}}.$$
Therefore,
$$ |\sigma| \le \left( D^{1/n}\cos\left({\pi \over qD}\right)\right)^n
   \le e^{-{n \over (qD)^2}} \le e^{-{n \over (qD)^3}}.$$
Finally, for $q,m \ge 2$, $\cos\left({\pi \over qm}\right) \ge
  \cos\left({\pi \over 4}\right)
= {1 \over \sqrt{2}} > {1 \over e}$, so that,
$$ |\sigma| \le \left(\cos\left({\pi \over qm}\right)\right)^{{n \over
(qD)^3}}.$$
}%%end better-sigma-bd proof


  Say a polynomial $t:\{0,1\}^n \rightarrow \Z$ is {\it block symmetric}
if it is a sum of polynomials
which are symmetric in pairwise disjoint subsets of the inputs. More precisely,
there are subsets $I_s \subseteq \{1,...,n\}$, where $s \not = s'
\Longrightarrow I_s \cap I_{s'} = \emptyset$, and corresponding polynomials
$t_s$, such that $t(x_1,...,x_n) = \sum_s t_s(x_{i_1},x_{i_2},...,x_{i_{n_s}})$,
where $n_s = |I_s|$ and $I_s = \{i_1,...,i_{n_s}\}$. The polynomials $t_s$ are
called the {\it symmetric polynomials of $t$}. We are now in a position to prove our
main theorem. As in the proof of Theorem~\ref{level-2}, the goal is to factorize
$\sigma$ into many factors of norm less than 1. However, in this case, due to the
presence of the block-symmetric polynomials, we must be content with $n^{\epsilon}$
factors for some $0 < \epsilon < 1$.

\theorem{\label{main}
  Let $q$ be prime and $m$ be a number such that $q$ does not divide $m$. Let
$C$ be a threshold-of-Mod$_m^+$ circuit, in which the polynomials defining the
Mod$_m^+$ circuits are block-symmetric (of degree ${\log^{O(1)} n}$), and where
$C$ computes the Mod$_q$ function. Then for some real number $0 \le \epsilon <
1$, the number of gates in $C$ is $2^{\Omega(n^{\epsilon})}$.
 }
\proof{ Consider one of the Mod$_m^+$ subcircuits, and let $t$ be the
polynomial in the inputs associated with that subcircuit.
  From Lemmas~\ref{discriminator} and \ref{reduction}, it is sufficient to
show that $|S(n,m,q;t)| \le 2^{-n^{\epsilon}}$. By hypothesis, $t$ is block
symmetric, so we may write,
$t(x_1,...,x_n) = \sum_s t_s(x_{i_1},x_{i_2},...,x_{i_{n_s}})$, where the
$t_s$ are the symmetric polynomials of $t$. Since $t_s$ is symmetric, it is
periodic mod $m$. Denote the period of $t_s$ mod $m$ as $D_s$. Since $D_s$ is
a polynomial in the degree of $t_s$, and since the degree of $t_s$ is bounded
by $O({\log^{O(1)} n})$, we know that $D_s = O({\log^{O(1)} n})$. From the
definitions of $S$ and $\sigma$, noting that $n = \sum_s n_s$, the sum $S$
factorizes as follows,
\begin{eqnarray*}
  S(n,m,q;t) = {1 \over m} \sum\limits_{a=0}^{m-1} \sum\limits_{b=0}^{q-1}
               (\zeta_q^{-b}-1) \prod\limits_s \sigma(n_s,m,q,a,b;t_s).
\end{eqnarray*}
Hence, using Lemma \ref{better-sigma-bd} and bounding $|\zeta_q^b - 1|$ by 2,
\begin{eqnarray*}
  |S(n,m,q;t)| & \le & 2\sum\limits_{b=1}^{q-1}
               \max\limits_{0\le a\le m-1}
                       \prod\limits_s |\sigma(n_s,m,q,a,b;t_s)|\\
       & \le & 2 \sum\limits_{b=1}^{q-1}
                  \prod\limits_s
             \left(\cos{\pi \over qm}\right)^{{n_s \over (qD_s)^3}}\\
       & \le & 2 (q-1)
                  \prod\limits_s
             \left(\cos{\pi \over qm}\right)^{{n_s \over {\log^{O(1)} n}}}\\
     & = & 2 (q-1)
                 \left(\cos{\pi \over qm}\right)^{{1 \over{\log^{O(1)} n}}
                                   \sum\limits_s n_s}\\
   & = & 2(q-1)\left(\cos{\pi \over qm}\right)^{{n \over {\log^{O(1)} n}}}\\
     & \le & 2^{-\Omega(n^{\epsilon})},
\end{eqnarray*}
for some $\epsilon < 1$.
}

\section{Discussion}

  It would of course be most desirable to obtain exponentially small upper
bounds for the quantity $|S(n,q,m;t)|$ for {\it any} low-degree polynomial
$t$. While obtaining good bounds for exponential sums can be very difficult,
the more general results entailing elaborate techniques and deep results from
algebraic geometry \cite{Katz} and/or analysis \cite{AKC}, the results of
this paper offer encouragement that the required techniques will not be quite
that hard. This is suggested by the following observations.

The techniques for estimating exponential sums in number theory are quite
sophisticated even for univariate polynomials. In
a sense the polynomials here are like univariate polynomials, since,
being symmetric, they only depend on the sum of the inputs. Nevertheless, the
techniques employed here are of an elementary nature, relying (for example)
on the Chinese remainder theorem and properties of the binomial coefficients
in $\Z/m\Z$. One of the apparent reasons for the differing techniques is
that here we are interested in a different type of asymptotic estimate. We
fix the degree of $t$ (or let it grow very slowly), allowing the number of
variables to approach infinity. By contrast in the theory of equations over
finite fields, the interest is in obtaining estimates in successively larger
finite extensions of a fixed base field, the polynomial (along with the
number of inputs) remaining fixed \cite{LN}. More importantly, for the known
techniques to apply, the polynomials have different kinds of constraints
imposed on them. For example, a wide class of results (see \cite{LN},
\cite{Sch}) holds if the polynomial is ``absolutely irreducible," i.e.,
irreducible in every finite extension of the base field. No such assumptions
were required here or in \cite{CGT}. The techniques employed here hold for
any low-degree block-symmetric polynomial, including those that are
reducible. While the known algebraic techniques will very likely play a role,
we are clearly facing a problem of a different nature.\\

\noindent{\bf Acknowledgements:} I wish to thank Jin-Yi Cai and Thomas
Thierauf for helpful conversations on this subject. The hospitality of
Steve Homer and the Computer Science Department of Boston University, where
part of this work was done while the author was on sabbatical from Clark
University, is gratefully acknowledged.

\begin{thebibliography}{1}
 
\bibitem[AKC]{AKC}
{\sc G.~I.~Arhipov, A.~A.~Karacuba, V.~N.~Cubarikov,} Multiple trigonometric
sums, {\it Proceedings of the Steklov Institute of Mathematics,} volume 151,
American Mathematical Society, Providence, 1982.

\bibitem[Al]{Allender}
{\sc E.~Allender,} A note on the power of threshold
circuits.  In {\it Proceedings of the 30th Symposium on
Foundations of Computer Science}, (1989), 580-584.

\bibitem[Bar]{Bar}{\sc D. Barrington}, Bounded-width polynomial-size
branching programs recognize exactly those languages in
NC$^1$, in {\it Journal of Computer and System
Sciences} {\bf 38}, (1989), 150-164.

\bibitem[BBR]{BBR}
{\sc D.~A.~M. Barrington, R. Beigel, and S. Rudich,}
Representing Boolean functions as polynomials modulo composite numbers,
in {\it Computational Complexity} {\bf 4} (1994) 367-382.

\bibitem[BT]{BT}
{\sc R.~Beigel and J.~Tarui,} On ACC, in {\it Computational Complexity}
{\bf 4} (1994) 350--366.

\bibitem[Cai]{Cai}
{\sc J.-Y. Cai,}
With probability one, a random oracle separates PSPACE from the
polynomial-time hierarchy  {\em Journal of Computer and System Science}
{\bf 38} (1989) 68-85.

\bibitem[CGT]{CGT}{\sc J.-Y.~Cai, F.~Green, and T.~Thierauf}, On the
correlation of symmetric functions, in {\it Mathematical Systems Theory} {\bf
29} (1996) 245-258.

\bibitem[FSS]{FSS}{\sc M.~Furst, J.~B.~Saxe, and M.~Sipser,} Parity,
circuits, and the polynomial-time hierarchy, in {\it Mathematical Systems
Theory,} {\bf 17} (1984) 13-27.

\bibitem[Go]{Go} {\sc M.~Goldmann,} A note on the power
of majority gates and modular gates, in {\it Information
Processing Letter,} {\bf 53} (1995), 321-327.

\bibitem[GKRST]{GKRST}
{\sc F.~Green, J.~K{\"o}bler, K.~Regan, T.~Schwentick, and J.~Tor{\'a}n,} The
power of the middle bit of a {\#}P function, in {\it Journal of
Computer and System Sciences} {\bf 50} (1995) 456-467.

\bibitem[Gr]{Gr91}
{\sc F. Green,} An oracle separating $\oplus$P from
PP$^{\PH}$, {\em Information Processing Letters} {\bf 37} (1991) 149-153.

\bibitem[Has]{Has}
{\sc J. H{\aa}stad,}
Computational limitations of small-depth circuits, the MIT press,
Cambridge, 1987.

\bibitem[HG]{HG} {\sc J.~H{\aa}stad and M.~Goldmann,} On the power of
small-depth threshold circuits, in {\it Computational Complexity,} {\bf 1}
(1991) 113-129.

\bibitem[HMPST]{HMPST}
{\sc A. Hajnal, W. Maass, P. Pudl{\'a}k, M. Szegedy, and
G. Tur{\'a}n,} Threshold circuits of bounded depth, in
{\em Proceedings 28th Annual IEEE Symposium on Foundations of
Computer Science}, IEEE Computer Society Press (1987) 99-110.
 
\bibitem[Ka]{Katz} {\sc N.~Katz,} Sommes exponentielles, {\it Ast{\'e}risque}
{\bf 79} (1980).
 
\bibitem[KP]{KP}
{\sc M.~Krause and P.~Pudl{\'a}k,} On the computational
power of depth 2 circuits with threshold and modulo gates,
in {\em Proceedings of the Twenty-Sixth Annual ACM
Symposium on the Theory of Computing,} ACM Press (1994)
48-57.

\bibitem[LN]{LN}
{\sc R.~Lidl and H.~Niederreiter,} Finite Fields, {\em
Encyclopedia of Mathematics and its Applications,}
Vol.~20, Cambridge University Press, Cambridge, 1983.

\bibitem[Raz]{Raz}
{\sc A. A. Razborov,} Lower bounds on the size of bounded
depth networks over a complete basis with logical addition, {\em
Matematicheskie Zametki} {\bf 41} (1987) 598-607. English translation in {\em
Mathematical Notes of the Academy of Sciences of the USSR} {\bf 41} (1987)
333-338.

\bibitem[Sch]{Sch}
{\sc W. M. Schmidt,}
Equations over finite fields: An elementary approach, {\em Lecture Notes in 
Mathematics},
vol. 536, Springer, New York, 1976.

\bibitem[Sm]{Sm}{\sc R. Smolensky,} Algebraic methods in the
theory of lower bounds for Boolean circuit complexity, in {\em
Proceedings of the 19th Annual ACM Symposium on Theory of
Computing} (1987) 77-82.

\bibitem[Tod]{Toda}{\sc S.~Toda.}
PP is as hard as the polynomial-time hierarchy.  In {\it
SIAM Journal on Computing} {\bf 20}, (1991) 865-877.

\bibitem[Yao~85]{Yao1}
 {\sc A.C. Yao,} Separating the polynomial-time hierarchy
by oracles, in {\em Proceedings of the 26th Annual IEEE Symposium on Foundations
of Computer Science} (1985) 1-10.
 
\bibitem[Yao~90]{Yao2}{\sc A. Yao,}
On ACC and threshold circuits.  In {\it Proceedings of
the 31st Symposium on Foundations of Computer Science},
(1990), 619-627.



\end{thebibliography}

\bigskip

\noindent{\Large\bf Appendix: Proof of Lemma~\ref{basic-sigma-bd}}

Performing the sum over $x_1$,
\begin{eqnarray*}
  \sigma & = & {1\over 2^n} \sum\limits_{(x_2,...,x_n) \in \{0,1\}^{n-1}}
         \left( \zeta_q^{(b\sum_{i=2}^n x_i)}\zeta_m^{at(0,x_2,...,x_n)} +
                \zeta_q^b \zeta_q^{(b\sum_{i=2}^n
                        x_i)}\zeta_m^{at(1,x_2,...,x_n)} \right)\\
   & = & {1\over 2^n} \sum\limits_{(x_2,...,x_n) \in \{0,1\}^{n-1}}
        \zeta_q^{(b\sum_{i=2}^n x_i)}\zeta_m^{at(0,x_2,...,x_n)} \cdot
     \left(1 +
\zeta_q^b\zeta_m^{a(t(1,x_2,...,x_n)-t(0,x_2,...,x_n))}\right).
\end{eqnarray*}
Then by the triangle inequality,
\begin{eqnarray*}
  |\sigma| 
    \le  {1\over 2^n} \sum\limits_{(x_2,...,x_n) \in \{0,1\}^{n-1}}
\left|1 +
\zeta_q^b\zeta_m^{a(t(1,x_2,...,x_n)-t(0,x_2,...,x_n))}\right|.
\end{eqnarray*}
For notational convenience, let $\Delta t$ denote 
$t(1,x_2,...,x_n)-t(0,x_2,...,x_n)$.
Now $\zeta_q^b \zeta_m^{a\Delta t} =
           \zeta_{qm}^{bm+qa\Delta t}$.
Furthermore, $\zeta_{qm}^{bm+qa\Delta t}=1$
if and only if $bm+qa\Delta t \equiv 0~(qm)$. But if the latter holds, then
$q$ must divide either $b$ or $m$, neither of which is possible. Hence,
$$\zeta_{qm}^{bm+qa\Delta t}\not =1,$$
so the maximum value that $|1+\zeta_{qm}^{bm+qa\Delta t}|$ can be is
$|1+\zeta_{qm}|$. Hence, from the foregoing inequality for $|\sigma|$,
\begin{eqnarray*}
|\sigma| & \le & {1 \over 2^n} |1+\zeta_{qm}| \sum\limits_{(x_2,...,x_n) \in
             \{0,1\}^{n-1}} 1\\
     & = & {|1 + \zeta_{qm}| \over 2}\\
     & = & \cos\left({\pi \over qm}\right).
\end{eqnarray*}





\end{document}
