Re: [PATCH] mailsplit and mailinfo: gracefully handle NUL characters | |
| [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] | |
Hi,
On Fri, 16 May 2008, Brian Foster wrote:
> Johannes Schindelin suggested:
> > The function fgets() has a big problem with NUL characters: it reads
> > them, but nobody will know if the NUL comes from the file stream, or
> > was appended at the end of the line.
> >
> > So implement a custom read_line() function.
> ^^^^^^^^^^^
> read_line_with_nul()
> meaning read part or all of one line which may contain NULs.
Right.
> >[ ... ]
> > diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c
> > index 46b27cd..021dc16 100644
> > --- a/builtin-mailsplit.c
> > +++ b/builtin-mailsplit.c
> > @@ -45,6 +45,25 @@ static int is_from_line(const char *line, int len)
> > /* Could be as small as 64, enough to hold a Unix "From " line. */
> > static char buf[4096];
> >
> > +/* We cannot use fgets() because our lines can contain NULs */
> > +int read_line_with_nul(char *buf, int size, FILE *in)
> > +{
> > + int len = 0, c;
> > +
> > + for (;;) {
> > + c = fgetc(in);
> > + buf[len++] = c;
> > + if (c == EOF || c == '\n' || len + 1 >= size)
> > + break;
> > + }
> > +
> > + if (c == EOF)
> > + len--;
> > + buf[len] = '\0';
> > +
> > + return len;
>
> when fgetc(3) — why not use getc(3)? -
Because mailsplit can read from a file, too.
> returns EOF it is pointlessly stored in buf[] (as a 'char'!), len's
> advanced, and then the storage and advancing are undone. isn't that a
> bit silly?
I left it at that, because it is a rare case, the buffer has to be
accessed with the trailing NUL anyway, and I think it is worth to have
this function quite readable. I, for one, am pretty certain that I
understand what this function does, and how, in 6 months from now, without
any additional documentation.
> untested:
>
> assert(2 <= size);
> do {
> if ((c = getc(in)) == EOF)
> break;
> } while (((buf[len++] = c) != '\n' && len+1 < size);
> buf[len] = '\0'
>
> return len;
... except this is unreadable at best ;-)
> I'd tend to write this in terms of pointers,
> something along the lines (untested):
>
> char *p, *endp;
>
> assert(1 <= size);
> p = buf;
> endp = p + (size-1);
> while (p < endp) {
> if ((c = getc(in)) == EOF || (*p++ = c) == '\n')
> break;
> }
> *p = '\0';
>
> return p - buf;
Again, I think this is too cuddled. You have to think about every second
line, and that makes for stupid mistakes with this developer.
Ciao,
Dscho
[Newbies FAQ] [Kernel List] [Site Home] [Free Online Dating] [Gcc Help] [IETF Annouce] [DCCP] [Netdev] [Networking] [Security] [V4L] [Bugtraq] [Free Online Dating] [Rubini] [Photo] [Yosemite] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Linux SCSI] [DDR & Rambus] [Linux Resources]