Can a 32 bit GCC running on a 64 bit Intel be made to emit SSE SIMD instructions automatically?

What is given, unchangeably, is a 32 bit GNU/Linux system
on 64 bit Intel hardware with SSE3.
The systems have a 32 bit GCC (from the OS distribution),
version 4.6.1 or 4.6.3. Some things are in C, others are in Ada.

I am targeting a function like the one below (semi-pseudocode)
and trying to have GCC automatically generate SSE SIMD instructions
such as mulpd or divpd for the above system. "Automatically" means
*without* importing builtins/intrinsics, explicitly or via #include.
This works just fine using attributes when everything
is 64 bits, using -O3 -msse3 -mfpmath=sse.
I cannot get it to work at all when OS and GCC are 32 bit.
(Using intrinsics explicitly does work, sort of, but that's
beside the point, if I may say so.)

So, is there any way to make a 32 bit GCC emit mulpd and
the like automatically (and correctly), without explicitly
calling built-ins, like there is when everything is 64 bits?
(Maybe: Does the vectorizer, if applicable, recognize that
it may assume presence of suitable hardware and act

VECTOR f(double a, double b, double c, double d)
  return (VECTOR) { a OP c, b OP d };

(VECTOR is an array of two 64 bit floating-point components
in all GCC languages I have tried. It works nicely, as said,
with 64 bit compilers.)
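For concreteness, here is a minimal C sketch of the function above, assuming
VECTOR is declared with GCC's vector_size attribute and OP is multiplication.
The names v2df, mul2 and lane are made up for illustration. This only
restates the question in compilable form; it does not show that a 32 bit
GCC 4.6 emits mulpd for it.

```c
/* Assumption: VECTOR is two doubles packed into 16 bytes via the
   GCC vector_size attribute. No intrinsics header is included. */
typedef double v2df __attribute__((vector_size(16)));

/* The function from the question, with OP chosen as '*'.
   GCC applies arithmetic operators on vector types element-wise. */
v2df mul2(double a, double b, double c, double d)
{
    v2df x = { a, b };
    v2df y = { c, d };
    return x * y;   /* { a * c, b * d } */
}

/* Hypothetical helper: read one component through a union, which
   avoids relying on vector subscripting in older compilers. */
double lane(v2df v, int i)
{
    union { v2df v; double d[2]; } u;
    u.v = v;
    return u.d[i];
}
```

Compiling this with something like gcc -m32 -O3 -msse3 -mfpmath=sse -S and
searching the assembly output for mulpd is one way to check what a given
compiler actually does with it.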


