Commit Graph

2291 Commits

Author SHA1 Message Date
Yasuyuki Kozakai f0daaa654a [NETFILTER] ip6tables: whitespace and indent cosmetic cleanup
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:39:39 -08:00
Yasuyuki Kozakai 6dd42af790 [NETFILTER] Makefile cleanup
These are replaced with x_tables matches and no longer exist.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:38:56 -08:00
Benoit Boissinot ccc91324a1 [NETFILTER] ip[6]t_policy: Fix compilation warnings
ip[6]t_policy argument conversion slipped when merging with x_tables

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:26:34 -08:00
Kris Katterjohn e35bedf369 [NET]: Fix whitespace issues in net/core/filter.c
This fixes some whitespace issues in net/core/filter.c

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:25:52 -08:00
Amnon Aaronsohn dd914b4082 [PKT_SCHED] sch_prio: fix qdisc bands init
Currently when PRIO is configured to use N bands, it lets the packets be
directed to any of the bands 0..N-1. However, PRIO attaches a fifo qdisc
only to the bands that appear in the priomap; the rest of the N bands
remain with a noop qdisc attached. This patch changes PRIO's behavior so
that it attaches a fifo qdisc to all of the N bands.

Signed-off-by: Amnon Aaronsohn <bla@cs.huji.ac.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:24:26 -08:00
YOSHIFUJI Hideaki 9343e79a7b [IPV6]: Preserve procfs IPV6 address output format
Procfs always output IPV6 addresses without the colon
characters, and we cannot change that.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:10:53 -08:00
Linus Torvalds caf5b04c82 x86: Work around compiler code generation bug with -Os
Some versions of gcc generate incorrect code for the inet_check_attr()
function, apparently due to a totally bogus index -> pointer comparison
transformation.

At least "gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)" from FC4 is
affected, possibly others too.

This changes the function subtly so that the buggy gcc transformation
doesn't trigger.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 22:08:28 -08:00
Arjan van de Ven 858119e159 [PATCH] Unlinline a bunch of other functions
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 18:27:06 -08:00
David S. Miller 37d8dc82e0 [NETFILTER] x-tables: Missing linux/ipv6.h includes.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 16:19:44 -08:00
Patrick McHardy dca80b962a [PKT_SCHED]: Change default clock source to gettimeofday
The default of using jiffies is very bad and results in
underutilization except with very low bandwidth.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:36:55 -08:00
Patrick McHardy ee51b1b6ce [XFRM]: IPsec tunnel wildcard address support
When the source address of a tunnel is given as 0.0.0.0 do a routing lookup
to get the real source address for the destination and fill that into the
acquire message. This allows to specify policies like this:

spdadd 172.16.128.13/32 172.16.0.0/20 any -P out ipsec
        esp/tunnel/0.0.0.0-x.x.x.x/require;
spdadd 172.16.0.0/20 172.16.128.13/32 any -P in ipsec
        esp/tunnel/x.x.x.x-0.0.0.0/require;

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:34:36 -08:00
Kris Katterjohn 7b11f69fb5 [NET]: Clean up comments for sk_chk_filter()
This removes redundant comments, and moves one comment to a better
location.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:33:06 -08:00
Joe Perches 46b86a2da0 [NET]: Use NIP6_FMT in kernel.h
There are errors and inconsistency in the display of NIP6 strings.
	ie: net/ipv6/ip6_flowlabel.c

There are errors and inconsistency in the display of NIPQUAD strings too.
	ie: net/netfilter/nf_conntrack_ftp.c

This patch:
	adds NIP6_FMT to kernel.h
	changes all code to use NIP6_FMT
	fixes net/ipv6/ip6_flowlabel.c
	adds NIPQUAD_FMT to kernel.h
	fixes net/netfilter/nf_conntrack_ftp.c
	changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:29:07 -08:00
Per Liden 23b0ca5bf5 [PATCH] genetlink: don't touch module ref count
Increasing the module ref count at registration will block the module from
ever being unloaded. In fact, genetlink should not care about the owner at
all. This patch removes the owner field from the struct registered with
genetlink.

Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 13:06:40 -08:00
Harald Welte 2e4e6a17af [NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables
This monster-patch tries to do the best job for unifying the data
structures and backend interfaces for the three evil clones ip_tables,
ip6_tables and arp_tables.  In an ideal world we would never have
allowed this kind of copy+paste programming... but well, our world
isn't (yet?) ideal.

o introduce a new x_tables module
o {ip,arp,ip6}_tables depend on this x_tables module
o registration functions for tables, matches and targets are only
  wrappers around x_tables provided functions
o all matches/targets that are used from ip_tables and ip6_tables
  are now implemented as xt_FOOBAR.c files and provide module aliases
  to ipt_FOOBAR and ip6t_FOOBAR
o header files for xt_matches are in include/linux/netfilter/,
  include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers
  around the xt_FOOBAR.h headers

Based on this patchset we're going to further unify the code,
gradually getting rid of all the layer 3 specific assumptions.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-12 14:06:43 -08:00
David S. Miller 880b005f29 [TIPC]: Fix 64-bit build warnings.
When storing u32 values in a pointer, need to do
some long casts to keep GCC happy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-12 14:06:41 -08:00
Per Liden 593a5f22d8 [TIPC] More updates of file headers
Updated copyright notice to include the year the file was
actually created. Information about file creation dates
was extracted from the files in the old CVS repository
at tipc.sourceforge.net.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:39 -08:00
Per Liden 9da1c8b694 [TIPC] Update of file headers
The copyright statements from different parts of Ericsson
have been merged into one.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:38 -08:00
Per Liden d0a14a9dbd [TIPC] Cleaned up info/warn/err macros
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:37 -08:00
Per Liden 9ea1fd3c1a [TIPC] License header update
The license header in each file now more clearly state that this
code is licensed under a dual BSD/GPL. Before this was only
evident if you looked at the MODULE_LICENSE line in core.c.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:36 -08:00
Per Liden ea714ccda5 [TIPC] Moved configuration interface into tipc_config.h
Restored the old tipc_config.h to get a cleaner division between the
interfaces used by normal TIPC users and TIPC administration utilities.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:35 -08:00
Jon Maloy b70e4f45a8 [TIPC} Fixed bug in disc_timeout()
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
2006-01-12 14:06:33 -08:00
Per Liden 1dba974333 [TIPC] Use dynamically allocated family id with NETLINK_GENERIC
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:32 -08:00
Per Liden b97bf3fd8f [TIPC] Initial merge
TIPC (Transparent Inter Process Communication) is a protocol designed for
intra cluster communication. For more information see
http://tipc.sourceforge.net

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:31 -08:00
Randy Dunlap 4fc268d24c [PATCH] capable/capability.h (net/)
net: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Adrian Bunk bb7e8c5a55 [PKT_SCHED] net/sched/Kconfig: fix typo in NET_EMATCH_META description
Noted by Matt LaPlante <webmaster@cyberdogtech.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:40:30 -08:00
Evgeniy Polyakov 54608b7099 [PKT_SCHED] ematch: Remove bogus include.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:16 -08:00
Evgeniy Polyakov c3f343e4d7 [NET]: Fix diverter build.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:15 -08:00
Kris Katterjohn 8b3a70058b [NET]: Remove more unneeded typecasts on *malloc()
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:14 -08:00
David Woodhouse ae0f7d5f83 [IPV6]: Avoid calling ip6_xmit() with NULL sk
The ip6_xmit() function now assumes that its sk argument is non-NULL,
which isn't currently true when TCPv6 code is sending RST or ACK
packets. This fixes that code to use a socket of its own for sending
such packets, as TCPv4 does. (Thanks Andi for the pointer).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:13 -08:00
David S. Miller a776809755 [NETFILTER]: ip_ct_proto_gre_fini() cannot be __exit
It is invoked from failures paths of __init code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:12 -08:00
David S. Miller 82bf7e97ac [NET]: Some more missing include/etherdevice.h includes
For compare_ether_addr()

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:11 -08:00
David S. Miller 5bf887f2ff [IPV6]: Fix modular build with netfilter enabled.
Also, drop __exit marker from ipv6_netfilter_fini() as this
can be invoked from inet6_init() error handling paths.

Based upon a report from Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 21:02:21 -08:00
Linus Torvalds 9819d85c21 Fix net/core/wireless.c link failure
It needs <linux/etherdevice.h> for compare_ether_addr()
2006-01-10 19:35:19 -08:00
Nicolas Kaiser b8ab50bc55 netfilter: headers included twice
Headers included twice.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-11 02:04:35 +01:00
Bart De Schuymer 8a4c8a96a4 [EBTABLES] Don't match tcp/udp source/destination port for IP fragments
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:12:22 -08:00
Jesper Juhl 12fe2c588d [NET]: Remove unneeded kmalloc() return value casts
Get rid of needless casting of kmalloc() return value in net/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:08:21 -08:00
Jesper Juhl ea2e90dfce [RXRPC]: Decrease number of pointer derefs in connection.c
Decrease the number of pointer derefs in net/rxrpc/connection.c

Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:07:44 -08:00
Martin Murray ad8e4b75c8 [AF_NETLINK]: Fix DoS in netlink_rcv_skb()
From: Martin Murray <murrayma@citi.umich.edu>

Sanity check nlmsg_len during netlink_rcv_skb.  An nlmsg_len == 0 can
cause infinite loop in kernel, effectively DoSing machine.  Noted by
Matin Murray.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:02:29 -08:00
Patrick McHardy babbdb1a18 [NETFILTER]: Fix timeout sysctls on big-endian 64bit architectures
The connection tracking timeout variables are unsigned long, but
proc_dointvec_jiffies is used with sizeof(unsigned int) in the sysctl
tables. Since there is no proc_doulongvec_jiffies function, change the
timeout variables to unsigned int.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:35 -08:00
Patrick McHardy 9d28026b7e [NETFILTER]: Remove unused function from NAT protocol helpers
->print and ->print_range are not used (and apparently never were).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:34 -08:00
Patrick McHardy c07bc1ffbd [NETFILTER]: Fix return value confusion in PPTP NAT helper
ip_nat_mangle_tcp_packet doesn't return NF_* values but 0/1 for
failure/success.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:33 -08:00
Patrick McHardy 03b9feca89 [NETFILTER]: Fix another crash in ip_nat_pptp
The PPTP NAT helper calculates the offset at which the packet needs
to be mangled as difference between two pointers to the header. With
non-linear skbs however the pointers may point to two seperate buffers
on the stack and the calculation results in a wrong offset beeing
used.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:32 -08:00
Patrick McHardy 15db34702c [NETFILTER]: Fix crash in ip_nat_pptp
When an inbound PPTP_IN_CALL_REQUEST packet is received the
PPTP NAT helper uses a NULL pointer in pointer arithmentic to
calculate the offset in the packet which needs to be mangled
and corrupts random memory or crashes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:30 -08:00
Patrick McHardy bb94aa169e [NETFILTER]: net/ipv[46]/netfilter.c cleanups
Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few
unneccessary includes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:29 -08:00
Kris Katterjohn d3f4a687f6 [NET]: Change memcmp(,,ETH_ALEN) to compare_ether_addr()
This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two).

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:28 -08:00
Linus Torvalds 4f47707b05 Fix rpc shutdown event condition bug
We want to wait for the cl_users to go down to zero, not for it to stay
positive.  Quoth Trond (who wasn't even the author, but acked the wrong
version): "Argh! I need to increase my daily caffeine dosages."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:56:39 -08:00
Alan Cox 33f0f88f1c [PATCH] TTY layer buffering revamp
The API and code have been through various bits of initial review by
serial driver people but they definitely need to live somewhere for a
while so the unconverted drivers can get knocked into shape, existing
drivers that have been updated can be better tuned and bugs whacked out.

This replaces the tty flip buffers with kmalloc objects in rings. In the
normal situation for an IRQ driven serial port at typical speeds the
behaviour is pretty much the same, two buffers end up allocated and the
kernel cycles between them as before.

When there are delays or at high speed we now behave far better as the
buffer pool can grow a bit rather than lose characters. This also means
that we can operate at higher speeds reliably.

For drivers that receive characters in blocks (DMA based, USB and
especially virtualisation) the layer allows a lot of driver specific
code that works around the tty layer with private secondary queues to be
removed. The IBM folks need this sort of layer, the smart serial port
people do, the virtualisers do (because a virtualised tty typically
operates at infinite speed rather than emulating 9600 baud).

Finally many drivers had invalid and unsafe attempts to avoid buffer
overflows by directly invoking tty methods extracted out of the innards
of work queue structs. These are no longer needed and all go away. That
fixes various random hangs with serial ports on overflow.

The other change in here is to optimise the receive_room path that is
used by some callers. It turns out that only one ldisc uses receive room
except asa constant and it updates it far far less than the value is
read. We thus make it a variable not a function call.

I expect the code to contain bugs due to the size alone but I'll be
watching and squashing them and feeding out new patches as it goes.

Because the buffers now dynamically expand you should only run out of
buffering when the kernel runs out of memory for real.  That means a lot of
the horrible hacks high performance drivers used to do just aren't needed any
more.

Description:

tty_insert_flip_char is an old API and continues to work as before, as does
tty_flip_buffer_push() [this is why many drivers dont need modification].  It
does now also return the number of chars inserted

There are also

tty_buffer_request_room(tty, len)

which asks for a buffer block of the length requested and returns the space
found.  This improves efficiency with hardware that knows how much to
transfer.

and tty_insert_flip_string_flags(tty, str, flags, len)

to insert a string of characters and flags

For a smart interface the usual code is

    len = tty_request_buffer_room(tty, amount_hardware_says);
    tty_insert_flip_string(tty, buffer_from_card, len);

More description!

At the moment tty buffers are attached directly to the tty.  This is causing a
lot of the problems related to tty layer locking, also problems at high speed
and also with bursty data (such as occurs in virtualised environments)

I'm working on ripping out the flip buffers and replacing them with a pool of
dynamically allocated buffers.  This allows both for old style "byte I/O"
devices and also helps virtualisation and smart devices where large blocks of
data suddenely materialise and need storing.

So far so good.  Lots of drivers reference tty->flip.*.  Several of them also
call directly and unsafely into function pointers it provides.  This will all
break.  Most drivers can use tty_insert_flip_char which can be kept as an API
but others need more.

At the moment I've added the following interfaces, if people think more will
be needed now is a good time to say

 int tty_buffer_request_room(tty, size)

Try and ensure at least size bytes are available, returns actual room (may be
zero).  At the moment it just uses the flipbuf space but that will change.
Repeated calls without characters being added are not cumulative.  (ie if you
call it with 1, 1, 1, and then 4 you'll have four characters of space.  The
other functions will also try and grow buffers in future but this will be a
more efficient way when you know block sizes.

 int tty_insert_flip_char(tty, ch, flag)

As before insert a character if there is room.  Now returns 1 for success, 0
for failure.

 int tty_insert_flip_string(tty, str, len)

Insert a block of non error characters.  Returns the number inserted.

 int tty_prepare_flip_string(tty, strptr, len)

Adjust the buffer to allow len characters to be added.  Returns a buffer
pointer in strptr and the length available.  This allows for hardware that
needs to use functions like insl or mencpy_fromio.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:59 -08:00
Ingo Molnar 532347e2bb [PATCH] nfs: sleep_on() removal
Convert sleep_on() to wait_event_timeout().  Probably safe with the BKL but
could be racy once BKL use in NFS-client is gone.

Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:42 -08:00
Andrey Borzenkov 6dd214b554 [PATCH] fix /sys/class/net/<if>/wireless without dev->get_wireless_stats
dev->get_wireless_stats is deprecated but removing it also removes wireless
subdirectory in sysfs. This patch puts it back.

akpm: I don't know what's happening here.  This might be appropriate as a
2.6.15.x compatibility backport.  Waiting to hear from Jeff.

Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:24 -08:00
Linus Torvalds 80c0531514 Merge master.kernel.org:/pub/scm/linux/kernel/git/mingo/mutex-2.6 2006-01-09 17:31:38 -08:00
Linus Torvalds a457aa6c2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2006-01-09 17:06:53 -08:00
Jes Sorensen 1b1dcc1b57 [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar <mingo@elte.hu>

(finished the conversion)

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:24 -08:00
Adrian Bunk 93b1fae491 spelling: s/trough/through/
Additionally, one comment was reformulated by Joe Perches <joe@perches.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-10 00:13:33 +01:00
Arnaldo Carvalho de Melo dff2c03534 [INET_DIAG]: Introduce sk_diag_fill
To be called from inet_diag_get_exact, also rename inet_diag_fill to
inet_csk_diag_fill, for consistency with inet_twsk_diag_fill.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:56 -08:00
Arnaldo Carvalho de Melo c7d58aabdc [INET_DIAG]: Introduce inet_twsk_diag_dump & inet_twsk_diag_fill
To properly dump TIME_WAIT sockets and to reduce complexity a bit by
having per socket class accessor routines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:38 -08:00
Arnaldo Carvalho de Melo 4e852c0279 [INET_DIAG]: whitespace/simple cleanups
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:19 -08:00
Arnaldo Carvalho de Melo 7dbf075524 [INET_DIAG]: Use inet_twsk() with TIME_WAIT sockets
The fields being accessed in inet_diag_dump are outside sock_common, the
common part of struct sock and struct inet_timewait_sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:03 -08:00
Patrick McHardy a2c2064f7f [IPV6]: Set skb->priority in ip6_output.c
Set skb->priority = sk->sk_priority as in raw.c and IPv4.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:31 -08:00
Patrick McHardy cfacb0577e [IPV4]: ip_output.c needs xfrm.h
This patch fixes a warning from my IPsec patches:

   CC      net/ipv4/ip_output.o
net/ipv4/ip_output.c: In function 'ip_finish_output':
net/ipv4/ip_output.c:208: warning: implicit declaration of function
'xfrm4_output_finish'

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:28 -08:00
Jamal Hadi Salim 29f1df6cc1 [PKT_SCHED]: Fix qdisc return code.
The mapping between TC_ACTION_SHOT and the qdisc return codes is better
suited to NET_XMIT_BYPASS so as not to confuse TCP

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:26 -08:00
Kris Katterjohn 09a626600b [NET]: Change some "if (x) BUG();" to "BUG_ON(x);"
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);"

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:18 -08:00
Patrick McHardy 4bba392592 [PKT_SCHED]: Prefix tc actions with act_
Clean up the net/sched directory a bit by prefix all actions with act_.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:14 -08:00
Patrick McHardy 541673c859 [PKT_SCHED]: Fix memory leak when dumping in pedit action
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:12 -08:00
Patrick McHardy 31bd06eb33 [PKT_SCHED]: Remove some obsolete policer exports
Also make sure the legacy code is only built when CONFIG_NET_CLS_ACT
is not set.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:10 -08:00
Patrick McHardy f43c5a0df3 [PKT_SCHED]: Convert tc action functions to single skb pointers
tcf_action_exec only gets a single skb pointer and doesn't own the skb,
but passes double skb pointers (to a local variable) to the action
functions. Change to use single skb pointers everywhere.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:08 -08:00
Patrick McHardy 538e43a4bd [PKT_SCHED]: Use USEC_PER_SEC
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:05 -08:00
Patrick McHardy 2941a48631 [NET]: Convert net/{ipv4,ipv6,sched} to netdev_priv
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:03 -08:00
Linus Torvalds cf10b2853f Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-09 09:39:05 -08:00
Kirill Korotaev 14591de147 [PATCH] netlink oops fix due to incorrect error code
Fixed oops after failed netlink socket creation.

Wrong parathenses in if() statement caused err to be 1,
instead of negative value.

Trivial fix, not trivial to find though.

Signed-Off-By: Dmitry Mishin <dim@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Linus Torvalds <torvalds@osdl.org>
2006-01-09 09:36:52 -08:00
Johannes Berg a4bf26f30e [PATCH] ieee80211: enable hw wep where host has to build IV
This patch fixes some of the ieee80211 crypto related code so that
instead of having the host fully do crypto operations, the host_build_iv
flag works properly (for WEP in this patch) which, if turned on,
requires the hardware to do all crypto operations, but the ieee80211
layer builds the IV. The hardware also has to build the ICV.

Previously, the host_build_iv flag couldn't be used at all for WEP, and
not alone (with both host_decrypt and host_encrypt disabled) because the
crypto algorithm wasn't assigned. This is also fixed.

I have tested this patch both in host crypto mode and in hw crypto mode
(with the Broadcom chipset).

[resent, signing digitally caused it to be MIME-junked, sorry]

Signed-Off-By: Johannes Berg <johannes@sipsolutions.net>

Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2006-01-09 10:34:25 -05:00
Matt Mackall 18e92b12e8 [PATCH] tiny: Trim non-IPX builds
trivial: drop unused 802.3 code if we compile without IPX

(originally from http://wohnheim.fh-wedel.de/~joern/software/kernel/je/25/)

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:14:10 -08:00
Eric Dumazet 5160ee6fc8 [PATCH] shrink dentry struct
Some long time ago, dentry struct was carefully tuned so that on 32 bits
UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
of memory cache lines.

Then RCU was added and dentry struct enlarged by two pointers, with nice
results for SMP, but not so good on UP, because breaking the above tuning
(128 + 8 = 136 bytes)

This patch reverts this unwanted side effect, by using an union (d_u),
where d_rcu and d_child are placed so that these two fields can share their
memory needs.

At the time d_free() is called (and d_rcu is really used), d_child is known
to be empty and not touched by the dentry freeing.

Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
the previous content of d_child is not needed if said dentry was unhashed
but still accessed by a CPU because of RCU constraints)

As dentry cache easily contains millions of entries, a size reduction is
worth the extra complexity of the ugly C union.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Paul Jackson <pj@sgi.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@epoch.ncsc.mil>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:58 -08:00
Pekka Enberg f9f7500521 [PATCH] slab: remove unused align parameter from alloc_percpu
__alloc_percpu and alloc_percpu both take an 'align' argument which is
completely ignored.  snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use
it, but it will be ignored.  Therefore, remove the 'align' argument and fixup
the lone caller.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:12:39 -08:00
Adrian Bunk 9f5336e218 [IPV6]: small cleanups
This patch contains the following cleanups:
- addrconf.c: make addrconf_dad_stop() static
- inet6_connection_sock.c should #include <net/inet6_connection_sock.h>
  for getting the prototypes of it's global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 13:24:25 -08:00
Adrian Bunk 97dc627fb3 [IPV4]: make ip_fragment() static
Since there's no longer any external user of ip_fragment() we can make 
it static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 13:23:39 -08:00
Joe Kappus da7bc6ee8e [NETFILTER]: ip_conntrack_proto_sctp.c needs linux/interrupt.h
Signed-off-by: Joe Kappus <joecool1029@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:41 -08:00
Patrick McHardy e16a8f0b8c [NETFILTER]: Add ipt_policy/ip6t_policy matches
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:38 -08:00
Patrick McHardy eb9c7ebe69 [NETFILTER]: Handle NAT in IPsec policy checks
Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi
of the original packet from the conntrack information for IPsec policy
checks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:37 -08:00
Patrick McHardy b59c270104 [NETFILTER]: Keep conntrack reference until IPsec policy checks are done
Keep the conntrack reference until policy checks have been performed for
IPsec NAT support. The reference needs to be dropped before a packet is
queued to avoid having the conntrack module unloadable.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:36 -08:00
Patrick McHardy 5c901daaea [NETFILTER]: Redo policy lookups after NAT when neccessary
When NAT changes the key used for the xfrm lookup it needs to be done
again. If a new policy is returned in POST_ROUTING the packet needs
to be passed to xfrm4_output_one manually after all hooks were called
because POST_ROUTING is called with fixed okfn (ip_finish_output).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:35 -08:00
Patrick McHardy 4e8e9de7c2 [NETFILTER]: Use conntrack information to determine if packet was NATed
Preparation for IPsec support for NAT:
Use conntrack information instead of saving the saving and comparing the
addresses to determine if a packet was NATed and needs to be rerouted to
make it easier to extend the key.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:34 -08:00
Patrick McHardy 3e3850e989 [NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder
ip_route_me_harder doesn't use the port numbers of the xfrm lookup and
uses ip_route_input for non-local addresses which doesn't do a xfrm
lookup, ip6_route_me_harder doesn't do a xfrm lookup at all.

Use xfrm_decode_session and do the lookup manually, make sure both
only do the lookup if the packet hasn't been transformed already.

Makeing sure the lookup only happens once needs a new field in the
IP6CB, which exceeds the size of skb->cb. The size of skb->cb is
increased to 48b. Apparently the IPv6 mobile extensions need some
more room anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:33 -08:00
Patrick McHardy 8cdfab8a43 [IPV4]: reset IPCB flags when neccessary
Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit
function before the packet reenters IP. This is neccessary so the
encapsulated packets are checked not to be oversized in xfrm4_output.c
again. Reset all flags in sit when a packet changes its address family.

Also remove some obsolete IPSKB flags.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:32 -08:00
Patrick McHardy b05e106698 [IPV4/6]: Netfilter IPsec input hooks
When the innermost transform uses transport mode the decapsulated packet
is not visible to netfilter. Pass the packet through the PRE_ROUTING and
LOCAL_IN hooks again before handing it to upper layer protocols to make
netfilter-visibility symetrical to the output path.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:31 -08:00
Patrick McHardy 951dbc8ac7 [IPV6]: Move nextheader offset to the IP6CB
Move nextheader offset to the IP6CB to make it possible to pass a
packet to ip6_input_finish multiple times and have it skip already
parsed headers. As a nice side effect this gets rid of the manual
hopopts skipping in ip6_input_finish.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:29 -08:00
Patrick McHardy 16a6677fdf [XFRM]: Netfilter IPsec output hooks
Call netfilter hooks before IPsec transforms. Packets visit the
FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation
and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode
transform.

Patch from Herbert Xu <herbert@gondor.apana.org.au>:

Move the loop from dst_output into xfrm4_output/xfrm6_output since they're
the only ones who need to it. xfrm{4,6}_output_one() processes the first SA
all subsequent transport mode SAs and is called in a loop that calls the
netfilter hooks between each two calls.

In order to avoid the tail call issue, I've added the inline function
nf_hook which is nf_hook_slow plus the empty list check.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:28 -08:00
David S. Miller aa0e4e4aea [DCCP]: ipv6.c needs net/ip6_checksum.c
Reported by Dave Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:26 -08:00
Linus Torvalds d8d8f6a4fd Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2006-01-06 15:24:28 -08:00
Linus Torvalds 57d1c91fa6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild 2006-01-06 15:23:56 -08:00
Alexey Dobriyan a2167dc62e [NET]: Endian-annotate in_aton()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:24:54 -08:00
Alexey Dobriyan 76ab608d86 [NET]: Endian-annotate struct iphdr
And fix trivial warnings that emerged.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:24:29 -08:00
Trent Jaeger 5f8ac64b15 [LSM-IPSec]: Corrections to LSM-IPSec Nethooks
This patch contains two corrections to the LSM-IPsec Nethooks patches
previously applied.  

(1) free a security context on a failed insert via xfrm_user 
interface in xfrm_add_policy.  Memory leak.

(2) change the authorization of the allocation of a security context
in a xfrm_policy or xfrm_state from both relabelfrom and relabelto 
to setcontext.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:22:39 -08:00
Luiz Capitulino 69549ddd2f [PKTGEN]: Adds missing __init.
pktgen_find_thread() and pktgen_create_thread() are only called at
initialization time.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:19:31 -08:00
Joe 3cbc4ab58f [NETFILTER]: ipt_helper.c needs linux/interrupt.h
From: Joe <joecool1029@gmail.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:15:11 -08:00
Stephen Hemminger ee02b3a613 [BRIDGE] netfilter: vlan + hw checksum = bug?
It looks like the bridge netfilter code does not correctly update
the hardware checksum after popping off the VLAN header.

This is by inspection, I have *not* tested this.
To test you would need to set up a filtering bridge with vlans
and a device the does hardware receive checksum (skge, or sungem)

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:13:29 -08:00
Shaun Pereira a20a855479 [X25]: Fix for broken x25 module.
When a user-space server application calls bind on a socket, then in kernel
space this bound socket is considered 'x25-linked' and the SOCK_ZAPPED flag
is unset.(As in x25_bind()/af_x25.c).

Now when a user-space client application attempts to connect to the server
on the listening socket, if the kernel accepts this in-coming call, then it
returns a new socket to userland and attempts to reply to the caller.

The reply/x25_sendmsg() will fail, because the new socket created on
call-accept has its SOCK_ZAPPED flag set by x25_make_new().
(sock_init_data() called by x25_alloc_socket() called by x25_make_new()
sets the flag to SOCK_ZAPPED)).

Fix: Using the sock_copy_flag() routine available in sock.h fixes this.

Tested on 32 and 64 bit kernels with x25 over tcp.

Signed-off-by: Shaun Pereira <pereira.shaun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:11:35 -08:00
Kris Katterjohn 4bad4dc919 [NET]: Change sk_run_filter()'s return type in net/core/filter.c
It should return an unsigned value, and fix sk_filter() as well.

Signed-off-by: Kris Katterjohn <kjak@ispwest.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:08:20 -08:00
Kris Katterjohn dbbc098828 [NET]: Use newer is_multicast_ether_addr() in some files
This uses is_multicast_ether_addr() because it has recently been
changed to do the same thing these seperate tests are doing.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:05:58 -08:00
Sam Ravnborg 367cb70421 kbuild: un-stringnify KBUILD_MODNAME
Now when kbuild passes KBUILD_MODNAME with "" do not __stringify it when
used. Remove __stringnify for all users.
This also fixes the output of:

$ ls -l /sys/module/
drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia
drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia_core
drwxr-xr-x 3 root root 0 2006-01-05 14:24 "processor"
drwxr-xr-x 3 root root 0 2006-01-05 14:24 "psmouse"

The quoting of the module names will be gone again.
Thanks to GregKH + Kay Sievers for reproting this.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-01-06 21:17:50 +01:00
J. Bruce Fields 9e56904e41 SUNRPC: Make krb5 report unsupported encryption types
Print messages when an unsupported encrytion algorthm is requested or
 there is an error locating a supported algorthm.

 Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:59:00 -05:00
J. Bruce Fields 42181d4baf SUNRPC: Make spkm3 report unsupported encryption types
Print messages when an unsupported encrytion algorthm is requested or
 there is an error locating a supported algorthm.

 Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:59 -05:00
J. Bruce Fields 9eed129bbd SUNRPC: Update the spkm3 code to use the make_checksum interface
Also update the tokenlen calculations to accomodate g_token_size().

 Signed-off-by: Andy Adamson <andros@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:59 -05:00
Trond Myklebust 0065db3285 SUNRPC: Clean up xprt_destroy()
We ought never to be calling xprt_destroy() if there are still active
 rpc_tasks. Optimise away the broken code that attempts to "fix" that case.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:58 -05:00
Trond Myklebust 632e3bdc50 SUNRPC: Ensure client closes the socket when server initiates a close
If the server decides to close the RPC socket, we currently don't actually
 respond until either another RPC call is scheduled, or until xprt_autoclose()
 gets called by the socket expiry timer (which may be up to 5 minutes
 later).

 This patch ensures that xprt_autoclose() is called much sooner if the
 server closes the socket.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:57 -05:00
Chuck Lever f518e35aec SUNRPC: get rid of cl_chatty
Clean up: Every ULP that uses the in-kernel RPC client, except the NLM
 client, sets cl_chatty.  There's no reason why NLM shouldn't set it, so
 just get rid of cl_chatty and always be verbose.

 Test-plan:
 Compile with CONFIG_NFS enabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever 922004120b SUNRPC: transport switch API for setting port number
At some point, transport endpoint addresses will no longer be IPv4.  To hide
 the structure of the rpc_xprt's address field from ULPs and port mappers,
 add an API for setting the port number during an RPC bind operation.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.  NFSv2/3 and NFSv4 mounting should be carefully checked.
 Probably need to rig a server where certain services aren't running, or
 that returns an error for some typical operation.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever 35f5a422ce SUNRPC: new interface to force an RPC rebind
We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols.
 Start by creating an API to force RPC rebind, replacing logic that simply
 sets cl_port to zero.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.  NFSv2/3 and NFSv4 mounting should be carefully checked.
 Probably need to rig a server where certain services aren't running, or
 that returns an error for some typical operation.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever 0210714834 SUNRPC: switchable buffer allocation
Add RPC client transport switch support for replacing buffer management
 on a per-transport basis.

 In the current IPv4 socket transport implementation, RPC buffers are
 allocated as needed for each RPC message that is sent.  Some transport
 implementations may choose to use pre-allocated buffers for encoding,
 sending, receiving, and unmarshalling RPC messages, however.  For
 transports capable of direct data placement, the buffers can be carved
 out of a pre-registered area of memory rather than from a slab cache.

 Test-plan:
 Millions of fsx operations.  Performance characterization with "sio" and
 "iozone".  Use oprofile and other tools to look for significant regression
 in CPU utilization.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:55 -05:00
Adrian Bunk fb459f45f7 SUNRPC: net/sunrpc/xdr.c: remove xdr_decode_string()
This patch removes ths unused function xdr_decode_string().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Neil Brown <neilb@suse.de>
Acked-by: Charles Lever <Charles.Lever@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:53 -05:00
Trond Myklebust 969b7f2522 SUNRPC: Fix a potential race in rpc_pipefs.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:51 -05:00
Trond Myklebust 2bd615797e SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.
...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
 interrupt RPC calls too (as per the Solaris implementation).

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust e60859ac0e SUNRPC: rpc_execute should not return task->tk_status;
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust 89991c24e4 SUNRPC: Get rid of some unused exports
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:41 -05:00
Trond Myklebust 44c288732f NFSv4: stateful NFSv4 RPC call interface
The NFSv4 model requires us to complete all RPC calls that might
 establish state on the server whether or not the user wants to
 interrupt it. We may also need to schedule new work (including
 new RPC calls) in order to cancel the new state.

 The asynchronous RPC model will allow us to ensure that RPC calls
 always complete, but in order to allow for "synchronous" RPC, we
 want to add the ability to wait for completion.
 The waits are, of course, interruptible.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust 4ce70ada1f SUNRPC: Further cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust 963d8fe533 RPC: Clean up RPC task structure
Shrink the RPC task structure. Instead of storing separate pointers
 for task->tk_exit and task->tk_release, put them in a structure.

 Also pass the user data pointer as a parameter instead of passing it via
 task->tk_calldata. This enables us to nest callbacks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
Trond Myklebust abbcf28f23 SUNRPC: Yet more RPC cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
Olaf Kirch 93fbf1a5de [PATCH] Keep nfsd from exiting when seeing recv() errors
I submitted this one previously - svc_tcp_recvfrom currently returns
any errors to the caller, including ECONNRESET and the like.

This is something svc_recv isn't able to deal with:

	len = svsk->sk_recvfrom(rqstp);
	[...]
	if (len == 0 || len == -EAGAIN) {
		[...]
		return -EAGAIN;
	}

	[...]
	return len;

The nfsd main loop will exit when it sees an error code other than
EAGAIN.

The following patch fixes this problem

svc_recv is not equipped to deal with error codes other than EAGAIN,
and will propagate anything else (such as ECONNRESET) up to nfsd,
causing it to exit.

Signed-off-by: Olaf Kirch <okir@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:59 -08:00
NeilBrown 1f1e030bf7 [PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.
The hash.h hash_long function, when used on a 64 bit machine, ignores many
of the middle-order bits.  (The prime chosen it too bit-sparse).

IP addresses for clients of an NFS server are very likely to differ only in
the low-order bits.  As addresses are stored in network-byte-order, these
bits become middle-order bits in a little-endian 64bit 'long', and so do
not contribute to the hash.  Thus you can have the situation where all
clients appear on one hash chain.

So, until hash_long is fixed (or maybe forever), us a hash function that
works well on IP addresses - xor the bytes together.

Thanks to "Iozone" <capps@iozone.org> for identifying this problem.

Cc: "Iozone" <capps@iozone.org>

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Kris Katterjohn 46f25dffba [NET]: Change 1500 to ETH_DATA_LEN in some files
These patches add the header linux/if_ether.h and change 1500 to
ETH_DATA_LEN in some files.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:56 -08:00
Andrew Morton e924283bf9 [IPVS]: Another file needs linux/interrupt.h
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:55 -08:00
Yasuyuki Kozakai e8eaedf2f8 [NETFILTER]: Use HOPLIMIT metric as TTL of TCP reset sent by REJECT
HOPLIMIT metric is appropriate to TCP reset sent by REJECT target
than hard-coded max TTL. Thanks to David S. Miller for hint.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:28:57 -08:00
Patrick McHardy 0ae2cfe7f3 [NETFILTER]: nf_conntrack_l3proto_ipv4.c needs net/route.h
CC [M]  net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c: In function 'ipv4_refrag':
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:198: error: dereferencing pointer to incomplete type
make[3]: *** [net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o] Error 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:52 -08:00
Patrick McHardy 22dea562bb [NETFILTER]: Export ip6_masked_addrcmp, don't pass IPv6 addresses on stack
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:34 -08:00
Patrick McHardy b777e0ce74 [NETFILTER]: make ipv6_find_hdr() find transport protocol header
The original ipv6_find_hdr() finds the specified header in IPv6 packets.
This makes it possible to get transport header so that we can kill similar
loop in ip6_match_packet().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:16 -08:00
Patrick McHardy 1bd9bef6f9 [NETFILTER]: Call POST_ROUTING hook before fragmentation
Call POST_ROUTING hook before fragmentation to get rid of the okfn use
in ip_refrag and save the useless fragmentation/defragmentation step
when NAT is used.

The patch introduces one user-visible change, the POSTROUTING chain
in the mangle table gets entire packets, not fragments, which should
simplify use of the MARK and CLASSIFY targets for queueing as a nice
side-effect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:59 -08:00
Patrick McHardy abbcc73982 [NETFILTER]: Remove okfn usage in ip_vs_core.c
okfn should only be used from different contexts to avoid deep call chains,
i.e. by nf_queue.

Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:40 -08:00
Patrick McHardy a9b305c4e5 [NETFILTER]: ctnetlink: Fix dumping of helper name
Properly dump the helper name instead of internal kernel data.
Based on patch by Marcus Sundberg <marcus@ingate.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:02 -08:00
Patrick McHardy e7be6994ec [NETFILTER]: Fix module_param types and permissions
Fix netfilter module_param types and permissions. Also fix an off-by-one in
the ipt_ULOG nlbufsiz < 128k check.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:46 -08:00
Pablo Neira Ayuso 87711cb81c [NETFILTER]: Filter dumped entries based on the layer 3 protocol number
Dump entries of a given Layer 3 protocol number.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:23 -08:00
Pablo Neira Ayuso c1d10adb4a [NETFILTER]: Add ctnetlink port for nf_conntrack
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:05 -08:00
Pablo Neira Ayuso 205d67c7d9 [NETFILTER]: ctnetlink: remove unused variable
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:44 -08:00
Pablo Neira Ayuso d4d6bb41e0 [NETFILTER]: ctnetlink: fix conntrack mark race
Set conntrack mark before it is in hashes.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:25 -08:00
Pablo Neira Ayuso 0368309cb4 [NETFILTER]: ctnetlink: ctnetlink_event cleanup
Cleanup: Use 'else if' instead of a ugly 'goto' statement.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:08 -08:00
Pablo Neira Ayuso 47116eb201 [NETFILTER]: ctnetlink: use u_int32_t instead of unsigned int
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:50 -08:00
Pablo Neira Ayuso 984955b3d7 [NETFILTER]: ctnetlink: propagate ctnetlink_dump_tuples_proto return value back
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:29 -08:00
Yasuyuki Kozakai 90c4656eb4 [NETFILTER]: ctnetlink: Add sanity checkings for ICMP
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:03 -08:00
Pablo Neira Ayuso 684f7b296c [NETFILTER]: ctnetlink: remove bogus checks in ICMP protocol at dumping
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:16:41 -08:00
Jesper Juhl d695aa8a1f [NETFILTER]: Decrease number of pointer derefs in nf_conntrack_core.c
Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:16:16 -08:00
Jesper Juhl 3e4ead4fe5 [NETFILTER]: Decrease number of pointer derefs in nfnetlink_queue.c
Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:15:58 -08:00
Adrian Bunk 4ffd2e4907 [IPVS]: Fix compilation
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:14:43 -08:00
Linus Torvalds db9edfd7e3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Trivial manual merge fixup for usb_find_interface clashes.
2006-01-04 18:44:12 -08:00
Linus Torvalds 52347f4e81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2006-01-04 16:34:57 -08:00
Linus Torvalds d779188d2b Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-04 16:31:56 -08:00
Kay Sievers fd586bacf4 [PATCH] net: swich device attribute creation to default attrs
Recent udev versions don't longer cover bad sysfs timing with built-in
logic. Explicit rules are required to do that. For net devices, the
following is needed:
  ACTION=="add", SUBSYSTEM=="net", WAIT_FOR_SYSFS="address"
to handle access to net device properties from an event handler without
races.

This patch changes the main net attributes to be created by the driver
core, which is done _before_ the event is sent out and will not require
the stat() loop of the WAIT_FOR_SYSFS key.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:10 -08:00
Kay Sievers 312c004d36 [PATCH] driver core: replace "hotplug" by "uevent"
Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:08 -08:00
Thomas Young 74cb879822 [TCP] tcp_vegas: Fix slow start
Vegas' slow start was only adding one MSS per RTT rather than one for
every ack. Slow start behavior should now match Reno.

Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:59:32 -08:00
Kris Katterjohn 9369986306 [NET]: More instruction checks fornet/core/filter.c
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:58:36 -08:00
YOSHIFUJI Hideaki 181a46a56e [NETFILTER]: Use macro for spinlock_t/rwlock_t initializations/definition.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:54 -08:00
YOSHIFUJI Hideaki 196433c5b7 [IPV6]: Use macro for rwlock_t initialization.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:31 -08:00
YOSHIFUJI Hideaki ca40330248 [ECONET]: Use macro for spinlock_t definition.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:08 -08:00
Arnaldo Carvalho de Melo f190055ff5 [IPVS]: Add missing include <linux/net.h>
CC [M]  net/ipv4/ipvs/ip_vs_conn.o
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In
  function 'ip_vs_conn_new':
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:606:
  warning: implicit declaration of function 'net_ratelimit'
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In
  function 'ip_vs_random_dropentry':
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:810:
  warning: implicit declaration of function 'net_random'

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 02:02:20 -02:00
Arnaldo Carvalho de Melo 80e40daa47 [TCP]: syn_flood_warning is only needed if CONFIG_SYN_COOKIES is selected
CC      net/ipv4/tcp_ipv4.o
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/tcp_ipv4.c:665: warning:
  'syn_flood_warning' defined but not used

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:58:06 -02:00
Arnaldo Carvalho de Melo e4dfd449c8 [DCCP] ackvec: use u8 for the buf offsets
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:46:34 -02:00
Andrea Bittau 6742bbcbb8 [DCCP] ackvec: Fix spelling of "throw"
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:45:17 -02:00
Stephen Hemminger 40efc6fa17 [TCP]: less inline's
TCP inline usage cleanup:
 * get rid of inline in several places
 * replace __inline__ with inline where possible
 * move functions used in one file out of tcp.h
 * let compiler decide on used once cases

On x86_64: 
   text	   data	    bss	    dec	    hex	filename
3594701	 648348	 567400	4810449	 4966d1	vmlinux.orig
3593133	 648580	 567400	4809113	 496199	vmlinux

On sparc64:
   text	   data	    bss	    dec	    hex	filename
2538278	 406152	 530392	3474822	 350586	vmlinux.ORIG
2536382	 406384	 530392	3473158	 34ff06	vmlinux

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 16:03:49 -08:00
Stephen Hemminger 3c19065a1e [IEEE80211] ipw2200: Simplify multicast checks.
From: Stephen Hemminger <shemminger@osdl.org>

is_multicast_ether_addr() accepts broadcast too, so the
is_broadcast_ether_addr() calls are redundant.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 15:27:38 -08:00
Stephen Hemminger cd8787ab04 [IPV4] fib_trie: build fix
Need this to fix build of fib_trie in net-2.6.16 (rebased) tree.
The code needs the new inet_make_mask inline.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:38:34 -08:00
Stephen Hemminger 554c9a8ec3 [BRIDGE]: Fix faulty check in br_stp_recalculate_bridge_id()
One of the conversions from memcmp to compare_ether_addr is incorrect.
We need to do relative comparison to determine min MAC address to
use in bridge id. 

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:35:54 -08:00
Andrea Bittau e84a9f5e9c [DCCP]: Notify CCID only after ACK vectors have been processed.
The CCID should be notified of packet reception only when a packet is
valid.  Therefore, the ACK vector needs to be processed before
notifying the CCID.  Also, the CCID might need information provided by
the ACK vector.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:26:15 -08:00
Andrea Bittau 9e377202d2 [DCCP]: Send an ACK vector when ACKing a response packet
If ACK vectors are used, each packet with an ACK should contain an ACK
vector.  The only exception currently is response packets.  It
probably is not a good idea to store ACK vector state before the
connection is completed (to help protect from syn floods).

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:25:49 -08:00
Andrea Bittau 709dd3aaf5 [DCCP]: Do not process a packet twice when it's not in state DCCP_OPEN.
When packets are received, the connection is either in DCCP_OPEN
[fast-path] or it isn't.  If it's not [e.g. DCCP_PARTOPEN] upper
layers will perform sanity checks and parse options.  If it is in
DCCP_OPEN, dccp_rcv_established() will do it.  It is important not to
re-parse options in dccp_rcv_established() when it is not called from
the fast-path.  Else, fore example, the ack vector will be added twice
and the CCID will see the packet twice.

The solution is to always enfore sanity checks from the upper layers.
When packets arrive in the fast-path, sanity checks will be performed
before calling dccp_rcv_established().

Note(acme): I rewrote the patch to achieve the same result but keeping
dccp_rcv_established with the previous semantics and having it split
into __dccp_rcv_established, that doesn't does do any sanity check,
code in state != DCCP_OPEN use this lighter version as they already do
the sanity checks.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:25:17 -08:00
Patrick Caulfield 5062430c5c [DECNET]: Only use local routers
The attached patch makes DECnet routing only use routers from the same
area - rather than the highest rated router seen.

In theory there should not be an out-of-area router on a local network
but some networks are bridged rather than properly routed. VMS seems
to behave similarly: if I bring up a VMS node with no router then it
can't see anything else on the global network.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:24:02 -08:00
Roberto Nibali 4b5bdf5cc3 [IPVS]: Cleanup IP_VS_DBG statements.
From: Roberto Nibali <ratz@drugphish.ch>

The attached patch (against current -GIT) is a cleanup patch which does
following:

o lookup debug messages shifted back to 9
o added more informational value to flags and refcnt since those
entries can be in multiple referenced structures
o cleanup 80 char violation

It's the prepatch to the session pool implementation and helps very much
to debug and monitor important variables and structures regarding the
threshold limitation and persistency without the thousands of lookup
messages which noone is interested in.

Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:22:59 -08:00
Christoph Hellwig b5e5fa5e09 [NET]: Add a dev_ioctl() fallback to sock_ioctl()
Currently all network protocols need to call dev_ioctl as the default
fallback in their ioctl implementations.  This patch adds a fallback
to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
This way all the procotol ioctl handlers can be simplified and we don't
need to export dev_ioctl.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:18:33 -08:00
Christoph Hellwig 5ff7630e4a [NETROM]: Remove unessecary lock_sock calls in netrom_ioctl()
lock_sock is needed only in very few cases, so do it there instead of
around the switch statement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:14:46 -08:00
Per Liden b461d2f218 [NETLINK] genetlink: fix cmd type in genl_ops to be consistent to u8
Signed-off-by: Per Liden <per.liden@ericsson.com>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:13:29 -08:00
Benjamin LaHaise fd19f329a3 [AF_UNIX]: Convert to use a spinlock instead of rwlock
From: Benjamin LaHaise <bcrl@kvack.org>

In af_unix, a rwlock is used to protect internal state.  At least on my 
P4 with HT it is faster to use a spinlock due to the simpler memory 
barrier used to unlock.  This patch raises bw_unix to ~690K/s.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:10:46 -08:00
Benjamin LaHaise 4947d3ef8d [NET]: Speed up __alloc_skb()
From: Benjamin LaHaise <bcrl@kvack.org>

In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the 
shared info structure results in gcc being forced to do a reload of the 
pointer since it has no information on possible aliasing.  Fix this by 
using a pointer to refer to skb_shared_info.

By initializing skb_shared_info sequentially, the write combining buffers 
can reduce the number of memory transactions to a single write.  Reorder 
the initialization in __alloc_skb() to match the structure definition.  
There is also an alignment issue on 64 bit systems with skb_shared_info 
by converting nr_frags to a short everything packs up nicely.

Also, pass the slab cache pointer according to the fclone flag instead 
of using two almost identical function calls.

This raises bw_unix performance up to a peak of 707KB/s when combined 
with the spinlock patch.  It should help other networking protocols, too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:06:50 -08:00
Arnaldo Carvalho de Melo 14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Arnaldo Carvalho de Melo 25995ff577 [SOCK]: Introduce sk_receive_skb
Its common enough to to justify that, TCP still can't use it as it has the
prequeueing stuff, still to be made generic in the not so distant future :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:19 -08:00
Christoph Hellwig ce1d4d3e88 [NET]: restructure sock_aio_{read,write} / sock_{readv,writev}
Mid-term I plan to restructure the file_operations so that we don't need
to have all these duplicate aio and vectored versions.  This patch is
a small step in that direction but also a worthwile cleanup on it's own:

(1) introduce a alloc_sock_iocb helper that encapsulates allocating a
    proper sock_iocb
(2) add do_sock_read and do_sock_write helpers for common read/write
    code

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:18 -08:00
David S. Miller cbeb321a64 [NET]: Fix sock_init() return value.
It needs to return zero now that it is an initcall.

Also, net/nonet.c no longer needs a dummy sock_init().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:17 -08:00
Jaco Kroon f34fbb9713 [PKTGEN]: Deinitialise static variables.
static variables should not be explicitly initialised to 0.  This causes
them to be placed in .data instead of .bss.  This patch de-initialises 3
static variables in net/core/pktgen.c.

There are approximately 800 more such variables in the source tree
(2.6.15rc5).  If there is more interrest I'd be willing to track down the
rest of these as well and de-initialise them as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:16 -08:00
Eric Dumazet 90ddc4f047 [NET]: move struct proto_ops to const
I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global stubstitute to change all existing occurences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:15 -08:00
Andi Kleen 77d76ea310 [NET]: Small cleanup to socket initialization
sock_init can be done as a core_initcall instead of calling
it directly in init/main.c

Also I removed an out of date #ifdef.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:14 -08:00
Frank Filz 7708610b1b [SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:13 -08:00
Frank Filz 52ccb8e90c [SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:11 -08:00
Robert Olsson fd9662555c [IPV4] fib_trie: Add credits.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:10 -08:00
Stephen Hemminger 9eb2d62719 [TCP] cubic: use Newton-Raphson
Replace cube root algorithim with a faster version using Newton-Raphson.
Surprisingly, doing the scaled div64_64 is faster than a true 64 bit
division on 64 bit CPU's.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:09 -08:00
Stephen Hemminger 89b3d9aaf4 [TCP] cubic: precompute constants
Revised version of patch to pre-compute values for TCP cubic.
  * d32,d64 replaced with descriptive names
  * cube_factor replaces
	 srtt[scaled by count] / HZ * ((1 << (10+2*BICTCP_HZ)) / bic_scale)
  * beta_scale replaces
	8*(BICTCP_BETA_SCALE+beta)/3/(BICTCP_BETA_SCALE-beta);

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:08 -08:00
Stephen Hemminger c865e5d99e [PKT_SCHED] netem: packet corruption option
Here is a new feature for netem in 2.6.16. It adds the ability to
randomly corrupt packets with netem. A version was done by
Hagen Paul Pfeifer, but I redid it to handle the cases of backwards
compatibility with netlink interface and presence of hardware checksum
offload. It is useful for testing hardware offload in devices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:05 -08:00
Stephen Hemminger 8cbb512e50 [BRIDGE]: add version number
Add version info to bridge module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:04 -08:00
Stephen Hemminger edb5e46fc0 [BRIDGE]: limited ethtool support
Add limited ethtool support to bridge to allow disabling
features.

Note: if underlying device does not support a feature (like checksum
offload), then the bridge device won't inherit it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:03 -08:00
Stephen Hemminger 0e5eabac49 [BRIDGE]: filter packets in learning state
While in the learning state, run filters but drop the result.
This prevents us from acquiring bad fdb entries in learning state.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:02 -08:00
Stephen Hemminger 4433f420e5 [BRIDGE]: handle speed detection after carrier changes
Speed of a interface may not be available until carrier
is detected in the case of autonegotiation. To get the correct value
we need to recheck speed after carrier event.  But the check needs to
be done in a context that is similar to normal ethtool interface (can sleep).

Also, delay check for 1ms to try avoid any carrier bounce transitions.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:01 -08:00
Stephen Hemminger 4505a3ef72 [BRIDGE]: allow setting hardware address of bridge pseudo-dev
Some people are using bridging to hide multiple machines from an ISP
that restricts by MAC address. So in that case allow the bridge mac
address to be set to any of the existing interfaces.  I don't want to
allow any arbitrary value and confuse STP.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:00 -08:00
David S. Miller fbe9cc4a87 [AF_UNIX]: Use spinlock for unix_table_lock
This lock is actually taken mostly as a writer,
so using a rwlock actually just makes performance
worse especially on chips like the Intel P4.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:59 -08:00
Arnaldo Carvalho de Melo d83d8461f9 [IP_SOCKGLUE]: Remove most of the tcp specific calls
As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo d8313f5ca2 [INET6]: Generalise tcp_v6_hash_connect
Renaming it to inet6_hash_connect, making it possible to ditch
dccp_v6_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo a7f5e7f164 [INET]: Generalise tcp_v4_hash_connect
Renaming it to inet_hash_connect, making it possible to ditch
dccp_v4_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo 6d6ee43e0b [TWSK]: Introduce struct timewait_sock_ops
So that we can share several timewait sockets related functions and
make the timewait mini sockets infrastructure closer to the request
mini sockets one.

Next changesets will take advantage of this, moving more code out of
TCP and DCCP v4 and v6 to common infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo fc44b98053 [DCCP]: Use reqsk_free in dccp_v4_conn_request
Now we have the destructor (dccp_v4_reqsk_destructor) in our
request_sock_ops vtable.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:53 -08:00
Arnaldo Carvalho de Melo 3df80d9320 [DCCP]: Introduce DCCPv6
Still needs mucho polishing, specially in the checksum code, but works
just fine, inet_diag/iproute2 and all 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:52 -08:00
Arnaldo Carvalho de Melo 399c07def6 [IPV6]: Export ipv6_opt_accepted
It was already non-TCP specific, will be used by DCCPv6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:51 -08:00
Arnaldo Carvalho de Melo f21e68caa0 [DCCP]: Prepare the AF agnostic core for the introduction of DCCPv6
Basically exports a similar set of functions as the one exported by
the non-AF specific TCP code.

In the process moved some non-AF specific code from dccp_v4_connect to
dccp_connect_init and moved the checksum verification from
dccp_invalid_packet to dccp_v4_rcv, so as to use it in dccp_v6_rcv
too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:50 -08:00
Arnaldo Carvalho de Melo 34ca686081 [DCCP]: Just rename dccp_v4_prot to dccp_prot
To match TCP equivalent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:49 -08:00
Arnaldo Carvalho de Melo 3cf3dc6c2e [IPV6]: Export some symbols for DCCPv6
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:48 -08:00
Arnaldo Carvalho de Melo 0fa1a53e1f [IPV6]: Introduce inet6_timewait_sock
Out of tcp6_timewait_sock, that now is just an aggregation of
inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
inet_timewait_sock, that is common to the IPv6 transport protocols that use
timewait sockets, like DCCP and TCP.

tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
code to find the IPv6 area in a timewait sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo b9750ce13c [IPV6]: Generalise some functions
Using sk->sk_protocol instead of IPPROTO_TCP.

Will be used by DCCPv6 in the next changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:46 -08:00
Benjamin LaHaise 830a1e5c21 [AF_UNIX]: Remove superfluous reference counting in unix_stream_sendmsg
AF_UNIX stream socket performance on P4 CPUs tends to suffer due to a
lot of pipeline flushes from atomic operations.  The patch below
removes the sock_hold() and sock_put() in unix_stream_sendmsg().  This
should be safe as the socket still holds a reference to its peer which
is only released after the file descriptor's final user invokes
unix_release_sock().  The only consideration is that we must add a
memory barrier before setting the peer initially.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:45 -08:00
Benjamin LaHaise c1cbe4b7ad [NET]: Avoid atomic xchg() for non-error case
It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing.  This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:44 -08:00
Roberto Nibali f1f71e03b1 [IPVS]: remove dead code
This patch removes dead code. I don't see the reason to keep this cruft
around, besides cluttering the nice and functionally working code.

Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:43 -08:00
Stephen Hemminger 65a45441d7 [UDP]: udp_checksum_init return value
Since udp_checksum_init always returns 0 there is no point in
having it return a value.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:42 -08:00
Herbert Xu 3305b80c21 [IP]: Simplify and consolidate MSG_PEEK error handling
When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
it is left on the socket receive queue.  This means that when we detect
a checksum error we have to be careful when trying to free the packet
as someone could have dequeued it in the time being.

Currently this delicate logic is duplicated three times between UDPv4,
UDPv6 and RAWv6.  This patch moves them into a one place and simplifies
the code somewhat.

This is based on a suggestion by Eric Dumazet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo 57cca05af1 [DCCP]: Introduce dccp_ipv4_af_ops
And make the core DCCP code AF agnostic, just like TCP, now its time
to work on net/dccp/ipv6.c, we are close to the end!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:40 -08:00
Arnaldo Carvalho de Melo af05dc9394 [ICSK]: Move v4_addr2sockaddr from TCP to icsk
Renaming it to inet_csk_addr2sockaddr.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:39 -08:00
Arnaldo Carvalho de Melo 8292a17a39 [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops
And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo ca304b6104 [IPV6]: Introduce inet6_rsk()
And inet6_rsk_offset in inet_request_sock, for the same reasons as
inet_sock's pinfo6 member.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:37 -08:00
Arnaldo Carvalho de Melo 8129765ac0 [IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add
More work is needed tho to introduce inet6_request_sock from
tcp6_request_sock, in the same layout considerations as ipv6_pinfo in
inet_sock, next changeset will do that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:36 -08:00
Arnaldo Carvalho de Melo c2977c2213 [ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:34 -08:00
Arnaldo Carvalho de Melo 90b19d3169 [IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Arnaldo Carvalho de Melo 971af18bbf [IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Herbert Xu 89cee8b1cb [IPV4]: Safer reassembly
Another spin of Herbert Xu's "safer ip reassembly" patch
for 2.6.16.

(The original patch is here:
http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
and my only contribution is to have tested it.)

This patch (optionally) does additional checks before accepting IP
fragments, which can greatly reduce the possibility of reassembling
fragments which originated from different IP datagrams.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:31 -08:00
Bart De Schuymer d5228a4f49 [NETFILTER] ebtables: Support nf_log API from ebt_log and ebt_ulog
This makes ebt_log and ebt_ulog use the new nf_log api.  This enables
the bridging packet filter to log packets e.g. via nfnetlink_log.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:30 -08:00
Eric Dumazet 3183606469 [NETFILTER] ip_tables: NUMA-aware allocation
Part of a performance problem with ip_tables is that memory allocation
is not NUMA aware, but 'only' SMP aware (ie each CPU normally touch
separate cache lines)

Even with small iptables rules, the cost of this misplacement can be
high on common workloads.  Instead of using one vmalloc() area
(located in the node of the iptables process), we now allocate an area
for each possible CPU, using vmalloc_node() so that memory should be
allocated in the CPU's node if possible.

Port to arp_tables and ip6_tables by Harald Welte.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:29 -08:00
Stephen Hemminger df3271f336 [TCP] BIC: CUBIC window growth (2.0)
Replace existing BIC version 1.1 with new version 2.0.
The main change is to replace the window growth function
with a cubic function as described in:
  http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:28 -08:00
Stephen Hemminger 05d054503a [TCP] BIC: spelling and whitespace
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:27 -08:00
Stephen Hemminger 018da8f44c [TCP] BIC: remove low utilization code.
The latest BICTCP patch at:
http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm

disables the low_utilization feature of BICTCP because it doesn't work
in some cases. This patch removes it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:26 -08:00
Trent Jaeger df71837d50 [LSM-IPSec]: Security association restriction.
This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable access control per packets based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using the ipsec-tools.  ipsec-tools have
been modified (a separate ipsec-tools patch is available for version
0.5) that supports assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:24 -08:00
Jeff Garzik ac67c62473 Merge branch 'master' 2006-01-03 10:49:18 -05:00
Matt Mackall 4a4efbdee2 s/retreiv/retriev/g
As everyone knows, the rule is: "i before e.. um.. always."

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-03 13:27:11 +01:00
David L Stevens 5ab4a6c81e [IPV6] mcast: Fix multiple issues in MLDv2 reports.
The below "jumbo" patch fixes the following problems in MLDv2.

1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks
        all nonzero source queries on little-endian (!)]

2) Add locking to source filter list [resend of prior patch]

3) fix "mld_marksources()" to
        a) send nothing when all queried sources are excluded
        b) send full exclude report when source queried sources are
                not excluded
        c) don't schedule a timer when there's nothing to report

NOTE: RFC 3810 specifies the source list should be saved and each
  source reported individually as an IS_IN. This is an obvious DOS
  path, requiring the host to store and then multicast as many sources
  as are queried (e.g., millions...). This alternative sends a full, 
  relevant report that's limited to number of sources present on the
  machine.

4) fix "add_grec()" to send empty-source records when it should
        The original check doesn't account for a non-empty source
        list with all sources inactive; the new code keeps that
        short-circuit case, and also generates the group header
        with an empty list if needed.

5) fix mca_crcount decrement to be after add_grec(), which needs
        its original value

These issues (other than item #1 ;-) ) were all found by Yan Zheng,
much thanks!

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 14:03:00 -08:00
David S. Miller 1b93ae64ca [NET]: Validate socket filters against BPF_MAXINSNS in one spot.
Currently the checks are scattered all over and this leads
to inconsistencies and even cases where the check is not made.

Based upon a patch from Kris Katterjohn.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 13:57:59 -08:00
YOSHIFUJI Hideaki 6732badee0 [IPV6]: Fix addrconf dead lock.
We need to release idev->lcok before we call addrconf_dad_stop().
It calls ipv6_addr_del(), which will hold idev->lock.

Bug spotted by Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 13:35:15 -08:00
David Kimdon 79cac2a221 [BR_NETFILTER]: Fix leak if skb traverses > 1 bridge
Call nf_bridge_put() before allocating a new nf_bridge structure and
potentially overwriting the pointer to a previously allocated one.
This fixes a memory leak which can occur when the bridge topology
allows for an skb to traverse more than one bridge.

Signed-off-by: David Kimdon <david.kimdon@devicescape.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-26 17:27:10 -08:00
David L Stevens 6f4353d891 [IPV6]: Increase default MLD_MAX_MSF to 64.
The existing default of 10 is just way too low.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-26 17:03:46 -08:00
Hiroyuki YAMAMORI 291d809ba5 [IPV6]: Fix Temporary Address Generation
From: Hiroyuki YAMAMORI <h-yamamo@db3.so-net.ne.jp>

Since regen_count is stored in the public address, we need to reset it
when we start renewing temporary address.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-23 11:24:05 -08:00
YOSHIFUJI Hideaki 3dd3bf8357 [IPV6]: Fix dead lock.
We need to relesae ifp->lock before we call addrconf_dad_stop(),
which will hold ifp->lock.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-23 11:23:21 -08:00
David S. Miller e6469297d4 Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+git+ipv6-fix-20051221a 2005-12-22 07:41:27 -08:00
David S. Miller 9b78a82c1c [IPSEC]: Fix policy updates missed by sockets
The problem is that when new policies are inserted, sockets do not see
the update (but all new route lookups do).

This bug is related to the SA insertion stale route issue solved
recently, and this policy visibility problem can be fixed in a similar
way.

The fix is to flush out the bundles of all policies deeper than the
policy being inserted.  Consider beginning state of "outgoing"
direction policy list:

	policy A --> policy B --> policy C --> policy D

First, realize that inserting a policy into a list only potentially
changes IPSEC routes for that direction.  Therefore we need not bother
considering the policies for other directions.  We need only consider
the existing policies in the list we are doing the inserting.

Consider new policy "B'", inserted after B.

	policy A --> policy B --> policy B' --> policy C --> policy D

Two rules:

1) If policy A or policy B matched before the insertion, they
   appear before B' and thus would still match after inserting
   B'

2) Policy C and D, now "shadowed" and after policy B', potentially
   contain stale routes because policy B' might be selected
   instead of them.

Therefore we only need flush routes assosciated with policies
appearing after a newly inserted policy, if any.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-22 07:39:48 -08:00
Ian McDonald 4c7e689502 [DCCP]: Comment typo
I hope to actually change this behaviour shortly but this will help
anybody grepping code at present.

Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-21 19:02:39 -08:00
Kristian Slavov 1d1428045c [IPV6]: Fix address deletion
If you add more than one IPv6 address belonging to the same prefix and 
delete the address that was last added, routing table entry for that 
prefix is also deleted.
Tested on 2.6.14.4

To reproduce:
ip addr add 3ffe::1/64 dev eth0
ip addr add 3ffe::2/64 dev eth0
/* wait DAD */
sleep 1
ip addr del 3ffe::2/64 dev eth0
ip -6 route

(route to 3ffe::/64 should be gone)

In ipv6_del_addr(), if ifa == ifp, we set ifa->if_next to NULL, and later 
assign ifap = &ifa->if_next, effectively terminating the for-loop.
This prevents us from checking if there are other addresses using the same 
prefix that are valid, and thus resulting in deletion of the prefix.
This applies only if the first entry in idev->addr_list is the address to 
be deleted.

Signed-off-by: Kristian Slavov <kristian.slavov@nomadiclab.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-21 18:47:24 -08:00
Mika Kukkonen 7eb1b3d372 [VLAN]: Add two missing checks to vlan_ioctl_handler()
In vlan_ioctl_handler() the code misses couple checks for
error return values.

Signed-off-by: Mika Kukkonen <mikukkon@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-21 18:39:49 -08:00
Mika Kukkonen 0d77d59f62 [NETROM]: Fix three if-statements in nr_state1_machine()
I found these while compiling with extra gcc warnings;
considering the indenting surely they are not intentional?

Signed-off-by: Mika Kukkonen <mikukkon@iki.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-21 18:38:26 -08:00
YOSHIFUJI Hideaki 6b3ae80a63 [IPV6]: Don't select a tentative address as a source address.
A tentative address is not considered "assigned to an interface"
in the traditional sense (RFC2462 Section 4).
Don't try to select such an address for the source address.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-12-21 22:58:01 +09:00
YOSHIFUJI Hideaki c5e33bddd3 [IPV6]: Run DAD when the link becomes ready.
If the link was not available when the interface was created,
run DAD for pending tentative addresses when the link becomes ready.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-12-21 22:57:44 +09:00
YOSHIFUJI Hideaki 3c21edbd11 [IPV6]: Defer IPv6 device initialization until the link becomes ready.
NETDEV_UP might be sent even if the link attached to the interface was
not ready.  DAD does not make sense in such case, so we won't do so.
After interface

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-12-21 22:57:24 +09:00
YOSHIFUJI Hideaki 8de3351e6e [IPV6]: Try not to send icmp to anycast address.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-12-21 22:57:06 +09:00
YOSHIFUJI Hideaki 58c4fb86ea [IPV6]: Flag RTF_ANYCAST for anycast routes.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-12-21 22:56:42 +09:00
Trond Myklebust 48e4918775 SUNRPC: Fix "EPIPE" error on mount of rpcsec_gss-protected partitions
gss_create_upcall() should not error just because rpc.gssd closed the
 pipe on its end. Instead, it should requeue the pending requests and then
 retry.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-12-19 23:12:21 -05:00
Trond Myklebust b079fa7baa RPC: Do not block on skb allocation
If we get something like the following,
 [  125.300636]  [<c04086e1>] schedule_timeout+0x54/0xa5
 [  125.305931]  [<c040866e>] io_schedule_timeout+0x29/0x33
 [  125.311495]  [<c02880c4>] blk_congestion_wait+0x70/0x85
 [  125.317058]  [<c014136b>] throttle_vm_writeout+0x69/0x7d
 [  125.322720]  [<c014714d>] shrink_zone+0xe0/0xfa
 [  125.327560]  [<c01471d4>] shrink_caches+0x6d/0x6f
 [  125.332581]  [<c01472a6>] try_to_free_pages+0xd0/0x1b5
 [  125.338056]  [<c013fa4b>] __alloc_pages+0x135/0x2e8
 [  125.343258]  [<c03b74ad>] tcp_sendmsg+0xaa0/0xb78
 [  125.348281]  [<c03d4666>] inet_sendmsg+0x48/0x53
 [  125.353212]  [<c0388716>] sock_sendmsg+0xb8/0xd3
 [  125.358147]  [<c0388773>] kernel_sendmsg+0x42/0x4f
 [  125.363259]  [<c038bc00>] sock_no_sendpage+0x5e/0x77
 [  125.368556]  [<c03ee7af>] xs_tcp_send_request+0x2af/0x375
 then the socket is blocked until memory is reclaimed, and no
 progress can ever be made.

 Try to access the emergency pools by using GFP_ATOMIC.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-12-19 23:11:54 -05:00
Neil Horman 9bffc4ace1 [SCTP]: Fix sctp to not return erroneous POLLOUT events.
Make sctp_writeable() use sk_wmem_alloc rather than sk_wmem_queued to
determine the sndbuf space available. It also removes all the modifications
to sk_wmem_queued as it is not currently used in SCTP.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:24:40 -08:00
David S. Miller 399c180ac5 [IPSEC]: Perform SA switchover immediately.
When we insert a new xfrm_state which potentially
subsumes an existing one, make sure all cached
bundles are flushed so that the new SA is used
immediately.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:23:23 -08:00
Patrick McHardy 9e999993c7 [XFRM]: Handle DCCP in xfrm{4,6}_decode_session
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:03:46 -08:00
YOSHIFUJI Hideaki 3dd4bc68fa [IPV6]: Fix route lifetime.
The route expiration time is stored in rt6i_expires in jiffies.
The argument of rt6_route_add() for adding a route is not the
expiration time in jiffies nor in clock_t, but the lifetime
(or time left before expiration) in clock_t.

Because of the confusion, we sometimes saw several strange errors
(FAILs) in TAHI IPv6 Ready Logo Phase-2 Self Test.
The symptoms were analyzed by Mitsuru Chinen <CHINEN@jp.ibm.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:02:45 -08:00
Bart De Schuymer b03664869a [BRIDGE-NF]: Fix bridge-nf ipv6 length check
A typo caused some bridged IPv6 packets to get dropped randomly,
as reported by Sebastien Chaumontet. The patch below fixes this
(using skb->nh.raw instead of raw) and also makes the jumbo packet
length checking up-to-date with the code in
net/ipv6/exthdrs.c::ipv6_hop_jumbo.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 14:00:08 -08:00
Patrick McHardy 31cb5bd4dc [NETFILTER]: Fix incorrect dependency for IP6_NF_TARGET_NFQUEUE
IP6_NF_TARGET_NFQUEUE depends on IP6_NF_IPTABLES, not IP_NF_IPTABLES.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 13:53:26 -08:00
Patrick McHardy 0476f171af [NETFILTER]: Fix NAT init order
As noticed by Phil Oester, the GRE NAT protocol helper is initialized
before the NAT core, which makes registration fail.

Change the linking order to make NAT be initialized first.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-19 13:53:09 -08:00
Jeff Garzik 418fbfe979 Merge branch 'master' 2005-12-19 00:09:53 -05:00
Al Viro d3a880e1ff [PATCH] Address of void __user * is void __user * *, not void * __user *
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-15 10:04:31 -08:00
Stephen Hemminger a388442c37 [VLAN]: Fix hardware rx csum errors
Receiving VLAN packets over a device (without VLAN assist) that is
doing hardware checksumming (CHECKSUM_HW), causes errors because the
VLAN code forgets to adjust the hardware checksum.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-14 16:23:16 -08:00
Herbert Xu 1542272a60 [GRE]: Fix hardware checksum modification
The skb_postpull_rcsum introduced a bug to the checksum modification.
Although the length pulled is offset bytes, the origin of the pulling
is the GRE header, not the IP header.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-14 12:55:24 -08:00
David S. Miller 2edc2689f8 [PKT_SCHED]: Disable debug tracing logs by default in packet action API.
Noticed by Andi Kleen.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-13 22:59:50 -08:00
David S. Miller a1493d9cd1 [IPV6] addrconf: Do not print device pointer in privacy log message.
Noticed by Andi Kleen, it is pointless to emit the device
structure pointer in the kernel logs like this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-13 22:59:36 -08:00
Jeff Garzik 783e3385a1 Merge branch 'upstream-fixes' 2005-12-13 00:07:46 -05:00
Olaf Hering 1cf9e8a786 [PATCH] ieee80211_crypt_tkip depends on NET_RADIO
*** Warning: ".wireless_send_event" [net/ieee80211/ieee80211_crypt_tkip.ko] undefined!

Signed-off-by: Olaf Hering <olh@suse.de>

 net/ieee80211/Kconfig |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-12-12 23:59:28 -05:00
Linus Torvalds 14ee0a1414 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/nf-2.6 2005-12-12 15:49:56 -08:00
Marcus Sundberg 2f9616d4c4 [NETFILTER]: ip_nat_tftp: Fix expectation NAT
When a TFTP client is SNATed so that the port is also changed, the
port is never changed back for the expected connection.

Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-12 15:02:48 -08:00
Arnaldo Carvalho de Melo ecc51b6d5c [TCPv6]: Fix skb leak
Spotted by Francois Romieu, thanks!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-12 14:38:10 -08:00
Jeff Garzik b1086eef81 Merge branch 'master' 2005-12-12 15:24:45 -05:00
Kazunori MIYAZAWA 73d4f84fd0 [IPv6] IPsec: fix pmtu calculation of esp
It is a simple bug which uses the wrong member.

This bug does not seriously affect ordinary use of IPsec.
But it is important to pass IPv6 ready logo phase-2
conformance test of IPsec SGW.

Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-08 23:11:42 -08:00
Stephen Hemminger 246a421207 [NET]: Fix NULL pointer deref in checksum debugging.
The problem I was seeing turned out to be that skb->dev is NULL when
the checksum is being completed in user context. This happens because
the reference to the device is dropped (to allow it to be released
when packets are in the queue).

Because skb->dev was NULL, the netdev_rx_csum_fault was panicing on
deref of dev->name. How about this?

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-08 15:21:39 -08:00
David S. Miller 4ebf0ae261 [AF_PACKET]: Convert PACKET_MMAP over to vm_insert_page().
So we can properly use __GFP_COMP and avoid the use of
PG_reserved pages.

With extremely helpful review from Hugh Dickins.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-06 16:38:35 -08:00
David S. Miller dfb4b9dceb [TCP] Vegas: timestamp before clone
We have to store the congestion control timestamp on the SKB before we
clone it, not after.  Else we get no timestamping information at all.

tcp_transmit_skb() has been reworked so that we can do the timestamp
still in one spot, instead of at all the call sites.

Problem discovered, and initial fix, from Tom Young
<tyo@ee.unimelb.edu.au>.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-06 16:24:52 -08:00
Thomas Young 0d7bef600a [TCP] Vegas: Remove extra call to tcp_vegas_rtt_calc
Remove unneeded call to tcp_vegas_rtt_calc. The more accurate
microsecond value has already been registered prior to calling
tcp_vegas_cong_avoid.

Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-06 16:17:11 -08:00
Thomas Young 5b49561381 [TCP] Vegas: stop resetting rtt every ack
Move the resetting of rtt measurements to inside the once per RTT
block of code.

Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-06 16:16:34 -08:00
Steven Whitehouse 1f12bcc9d1 [DECNET]: add memory buffer settings
The patch (originally from Steve) simply adds memory buffer settings to 
DECnet similar to those in TCP.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:42:06 -08:00
Martin Waitz dab9630fb3 [NET]: make function pointer argument parseable by kernel-doc
When a function takes a function pointer as argument it should use the 'return
(*pointer)(params...)' syntax used everywhere else in the kernel as this is
recognized by kernel-doc.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:40:12 -08:00
Patrick McHardy 2fdf1faa8e [NETFILTER]: Don't use conntrack entry after dropping the reference
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:38:16 -08:00
Patrick McHardy 266c854348 [NETFILTER]: Fix unbalanced read_unlock_bh in ctnetlink
NFA_NEST calls NFA_PUT which jumps to nfattr_failure if the skb has no
room left. We call read_unlock_bh at nfattr_failure for the NFA_PUT inside
the locked section, so move NFA_NEST inside the locked section too.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:37:33 -08:00
Patrick McHardy 6636568cf8 [NETFILTER]: Wait for untracked references in nf_conntrack module unload
Noticed by Pablo Neira <pablo@eurodev.net>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:36:50 -08:00
Patrick McHardy a795756333 [NETFILTER]: Mark ctnetlink as EXPERIMENTAL
Should have been marked EXPERIMENTAL from the beginning, as the current
bunch of fixes show.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:36:25 -08:00
Patrick McHardy 0be7fa92ca [NETFILTER]: Fix CTA_PROTO_NUM attribute size in ctnetlink
CTA_PROTO_NUM is a u_int8_t.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:34:51 -08:00
Patrick McHardy afe5c6bb03 [NETFILTER]: Fix ip_conntrack_flush abuse in ctnetlink
ip_conntrack_flush() used to be part of ip_conntrack_cleanup(), which needs
to drop _all_ references on module unload. Table flushed using ctnetlink
just needs to clean the table and doesn't need to flush the event cache or
wait for any references attached to skbs. Move everything but pure table
flushing back to ip_conntrack_cleanup().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:33:50 -08:00
Yasuyuki Kozakai 3ebbe0cdd4 [NETFILTER]: nfnetlink: Fix calculation of minimum message length
At least, valid nfnetlink message should have nlmsghdr and nfgenmsg.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:33:26 -08:00
Yasuyuki Kozakai f16c910724 [NETFILTER]: nf_conntrack: Fix missing check for ICMPv6 type
This makes nf_conntrack_icmpv6 check that ICMPv6 type isn't < 128
to avoid accessing out of array valid_new[] and invmap[].

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:32:50 -08:00
Pablo Neira Ayuso 8d1ca69984 [NETFILTER]: Fix incorrect argument to ip_nat_initialized() in ctnetlink
ip_nat_initialized() takes enum ip_nat_manip_type as it's second argument,
not a hook number.

Noticed and initial patch by Marcus Sundberg <marcus@ingate.com>.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:32:14 -08:00
Jeff Garzik 2fde9901f6 Merge branch 'master' 2005-12-03 21:03:28 -05:00
Trond Myklebust bb184f3356 SUNRPC: Fix Oopsable condition in rpc_pipefs
The elements on rpci->in_upcall are tracked by the filp->private_data,
 which will ensure that they get released when the file is closed.

 The exception is if rpc_close_pipes() gets called first, since that
 sets rpci->ops to NULL.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-12-03 15:20:10 -05:00
YOSHIFUJI Hideaki af1afe8662 [IPV6]: Load protocol module dynamically.
[ Modified to match inet_create() bug fix by Herbert Xu -DaveM ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:56:57 -08:00
Herbert Xu 86c8f9d158 [IPV4] Fix EPROTONOSUPPORT error in inet_create
There is a coding error in inet_create that causes it to always return
ESOCKTNOSUPPORT.  It should return EPROTONOSUPPORT when there are
protocols registered for a given socket type but none of them match
the requested protocol.

This is based on a patch by Jayachandran C.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:43:26 -08:00
David Stevens 24c6927505 [IGMP]: workaround for IGMP v1/v2 bug
From: David Stevens <dlstevens@us.ibm.com>

As explained at:

	http://www.cs.ucsb.edu/~krishna/igmp_dos/

With IGMP version 1 and 2 it is possible to inject a unicast
report to a client which will make it ignore multicast
reports sent later by the router.

The fix is to only accept the report if is was sent to a
multicast or unicast address.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:32:59 -08:00
Neil Horman bf031fff1f [SCTP]: Fix getsockname for sctp when an ipv6 socket accepts a connection from
an ipv4 socket.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:32:29 -08:00
Neil Horman 6736dc35e9 [SCTP]: Return socket errors only if the receive queue is empty.
This patch fixes an issue where it is possible to get valid data after
a ENOTCONN error. It returns socket errors only after data queued on
socket receive queue is consumed.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-02 20:30:06 -08:00
Thomas Graf ea86575eaf [NETLINK]: Fix processing of fib_lookup netlink messages
The receive path for fib_lookup netlink messages is lacking sanity
checks for header and payload and is thus vulnerable to malformed
netlink messages causing illegal memory references.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-01 14:30:00 -08:00
Phil Oester 2a43c4af3f [NETFILTER]: Fix recent match jiffies wrap mismatches
Around jiffies wrap time (i.e. within first 5 mins after boot), recent
match rules which contain both --seconds and --hitcount arguments
experience false matches.

This is because the last_pkts array is filled with zeros on creation, and
when comparing 'now' to 0 (+ --seconds argument), time_before_eq thinks it
has found a hit.

Below patch adds a break if the packet value is zero.  This has the
unfortunate side effect of causing mismatches if a packet was received
when jiffies really was equal to zero.  The odds of that happening are
slim compared to the problems caused by not adding the break however.
Plus, the author used this same method just below, so it is "good enough".

This fixes netfilter bugs #383 and #395.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-01 14:29:24 -08:00
Jozsef Kadlecsik 73f306024c [NETFILTER]: Ignore ACKs ACKs on half open connections in TCP conntrack
Mounting NFS file systems after a (warm) reboot could take a long time if
firewalling and connection tracking was enabled.

The reason is that the NFS clients tends to use the same ports (800 and
counting down). Now on reboot, the server would still have a TCB for an
existing TCP connection client:800 -> server:2049. The client sends a
SYN from port 800 to server:2049, which elicits an ACK from the server.
The firewall on the client drops the ACK because (from its point of
view) the connection is still in half-open state, and it expects to see
a SYNACK.

The client will eventually time out after several minutes.

The following patch corrects this, by accepting ACKs on half open
connections as well.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-01 14:28:58 -08:00
Jeff Garzik e538af42e4 Merge branch 'master' 2005-12-01 01:54:02 -05:00
Adrian Bunk 34a0b3cdc0 [IPV6]: make two functions static
This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:28:56 -08:00
Adrian Bunk d127e94a5c [NETFILTER] ipv4: small cleanups
This patch contains the following cleanups:
- make needlessly global code static
- ip_conntrack_core.c: ip_conntrack_flush() -> ip_conntrack_flush(void)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:28:18 -08:00
Adrian Bunk 4b30b1c6a3 [IPV4]: make two functions static
This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:27:20 -08:00
Arjan van de Ven 9b5b5cff9a [NET]: Add const markers to various variables.
the patch below marks various variables const in net/; the goal is to
move them to the .rodata section so that they can't false-share
cachelines with things that get written to, as well as potentially
helping gcc a bit with optimisations.  (these were found using a gcc
patch to warn about such variables)

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:21:38 -08:00
Stanislaw Gruszka 64bf69ddff [ATM]: deregistration removes device from atm_devs list immediately
atm_dev_deregister() removes device from atm_dev list immediately to
prevent operations on a phantom device.  Decision to free device based
only on ->refcnt  now. Remove shutdown_atm_dev() use atm_dev_deregister()
instead.  atm_dev_deregister() also asynchronously releases all vccs
related to device.

Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:16:41 -08:00
Stanislaw Gruszka aaaaaadbe7 [ATM]: avoid race conditions related to atm_devs list
Use semaphore to protect atm_devs list, as no one need access to it from
interrupt context.  Avoid race conditions between atm_dev_register(),
atm_dev_lookup() and atm_dev_deregister().  Fix double spin_unlock() bug.

Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:16:21 -08:00
Mitchell Blank Jr 50accc9c42 [ATM]: attempt to autoload atm drivers
From: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:15:18 -08:00
Mitchell Blank Jr c219750b2e [ATM]: atm_pcr_goal() doesn't modify its argument's contents -- mark it as const
Signed-off-by: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:13:55 -08:00
Mitchell Blank Jr c9933d0856 [ATM]: always return the first interface for ATM_ITF_ANY
From: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:13:32 -08:00
Mike Stroyan 18955cfcb2 [IPV4] tcp/route: Another look at hash table sizes
The tcp_ehash hash table gets too big on systems with really big memory.
It is worse on systems with pages larger than 4KB.  It wastes memory that
could be better used.  It also makes the netstat command slow because reading
/proc/net/tcp and /proc/net/tcp6 needs to go through the full hash table.

  The default value should not be larger for larger page sizes.  It seems
that the effect of page size is an unintended error dating back a long
time.  I also wonder if the default value really should be a larger
fraction of memory for systems with more memory.  While systems with
really big ram can afford more space for hash tables, it is not clear to
me that they benefit from increasing the allocation ratio for this table.

  The amount of memory allocated is determined by net/ipv4/tcp.c:tcp_init and
mm/page_alloc.c:alloc_large_system_hash.

tcp_init calls alloc_large_system_hash passing parameters-
    bucketsize=sizeof(struct tcp_ehash_bucket)
    numentries=thash_entries
    scale=(num_physpages >= 128 * 1024) ? (25-PAGE_SHIFT) : (27-PAGE_SHIFT)
    limit=0

On i386, PAGE_SHIFT is 12 for a page size of 4K
On ia64, PAGE_SHIFT defaults to 14 for a page size of 16K

The num_physpages test above makes the allocation take a larger fraction
of the total memory on systems with larger memory.  The threshold size
for a i386 system is 512MB.  For an ia64 system with 16KB pages the
threshold is 2GB.

For smaller memory systems-
On i386, scale = (27 - 12) = 15
On ia64, scale = (27 - 14) = 13
For larger memory systems-
On i386, scale = (25 - 12) = 13
On ia64, scale = (25 - 14) = 11

  For the rest of this discussion, I'll just track the larger memory case.

  The default behavior has numentries=thash_entries=0, so the allocated
size is determined by either scale or by the default limit of 1/16 of
total memory.

In alloc_large_system_hash-
|	numentries = (flags & HASH_HIGHMEM) ? nr_all_pages : nr_kernel_pages;
|	numentries += (1UL << (20 - PAGE_SHIFT)) - 1;
|	numentries >>= 20 - PAGE_SHIFT;
|	numentries <<= 20 - PAGE_SHIFT;

  At this point, numentries is pages for all of memory, rounded up to the
nearest megabyte boundary.

|	/* limit to 1 bucket per 2^scale bytes of low memory */
|	if (scale > PAGE_SHIFT)
|		numentries >>= (scale - PAGE_SHIFT);
|	else
|		numentries <<= (PAGE_SHIFT - scale);

On i386, numentries >>= (13 - 12), so numentries is 1/8196 of
bytes of total memory.
On ia64, numentries <<= (14 - 11), so numentries is 1/2048 of
bytes of total memory.

|        log2qty = long_log2(numentries);
|
|        do {
|                size = bucketsize << log2qty;

bucketsize is 16, so size is 16 times numentries, rounded
down to a power of two.

On i386, size is 1/512 of bytes of total memory.
On ia64, size is 1/128 of bytes of total memory.

For smaller systems the results are
On i386, size is 1/2048 of bytes of total memory.
On ia64, size is 1/512 of bytes of total memory.

  The large page effect can be removed by just replacing
the use of PAGE_SHIFT with a constant of 12 in the calls to
alloc_large_system_hash.  That makes them more like the other uses of
that function from fs/inode.c and fs/dcache.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29 16:12:55 -08:00
Jeff Garzik 2226340eb8 Merge branch 'master' 2005-11-29 03:50:33 -05:00
YOSHIFUJI Hideaki 220bbd7483 [IPV6]: Implement appropriate dummy rule 4 in ipv6_dev_get_saddr().
Ensure to update hiscore.rule in dummy rule 4 in ipv6_dev_get_saddr().
Pointed out by Yan Zheng <yanzheng@21cn.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-28 22:27:11 -08:00
Trond Myklebust b3eb67a2ab SUNRPC: Funny looking code in __rpc_purge_upcall
In __rpc_purge_upcall (net/sunrpc/rpc_pipe.c), the newer code to clean up
 the in_upcall list has a typo.
 Thanks to Vince Busam <vbusam@google.com> for spotting this!

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-25 17:11:30 -05:00
Olaf Rempel 133747e8d1 [BRIDGE]: recompute features when adding a new device
We must recompute bridge features everytime the list of underlying 
devices changes, or we might end up with features that are not
supported by all devices (eg. NETIF_F_TSO)
This patch adds the missing recompute when adding a device to the bridge.

Signed-off-by: Olaf Rempel <razzor@kopf-tisch.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-23 19:04:08 -08:00
Benoit Boissinot de919820cf [NETFILTER]: ip_conntrack_netlink.c needs linux/interrupt.h
net/ipv4/netfilter/ip_conntrack_netlink.c: In function 'ctnetlink_dump_table':
net/ipv4/netfilter/ip_conntrack_netlink.c:409: warning: implicit declaration of function 'local_bh_disable'
net/ipv4/netfilter/ip_conntrack_netlink.c:427: warning: implicit declaration of function 'local_bh_enable'

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-23 19:03:46 -08:00
Pablo Neira Ayuso 00cb277a4a [NETFILTER] ctnetlink: Fix refcount leak ip_conntrack/nat_proto
Remove proto == NULL checking since ip_conntrack_[nat_]proto_find_get
always returns a valid pointer.

Fix missing ip_conntrack_proto_put in some paths.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-22 14:54:34 -08:00
Jamal Hadi Salim 0ff60a4567 [IPV4]: Fix secondary IP addresses after promotion
This patch fixes the problem with promoting aliases when:
a) a single primary and > 1 secondary addresses
b) multiple primary addresses each with at least one secondary address

Based on earlier efforts from Brian Pomerantz <bapper@piratehaven.org>,
Patrick McHardy <kaber@trash.net> and Thomas Graf <tgraf@suug.ch>

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-22 14:47:37 -08:00
Herbert Xu c27bd492fd [NETLINK]: Use tgid instead of pid for nlmsg_pid
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-22 14:41:50 -08:00
Patrick McHardy a516b04950 [DCCP]: Add missing no_policy flag to struct net_protocol
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 21:16:13 -08:00
Yasuyuki Kozakai 2b8f2ff6f4 [NETFILTER]: fixed dependencies between modules related with ip_conntrack
- IP_NF_CONNTRACK_MARK is bool and depends on only IP_NF_CONNTRACK
  which is tristate. If a variable depends on IP_NF_CONNTRACK_MARK and
  doesn't care about IP_NF_CONNTRACK, it can be y. This must be avoided.
- IP_NF_CT_ACCT has same problem.
- IP_NF_TARGET_CLUSTERIP also depends on IP_NF_MANGLE.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 21:09:55 -08:00
Patrick McHardy c9e53cbe7a [FIB_TRIE]: Don't show local table in /proc/net/route output
Don't show local table to behave similar to fib_hash.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 21:09:00 -08:00
David S. Miller 1ef43204f4 Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+advapi-fix/ 2005-11-20 20:52:16 -08:00
Yan Zheng 5d5780df23 [IPV6]: Acquire addrconf_hash_lock for read in addrconf_verify(...)
addrconf_verify(...) only traverse address hash table when
addrconf_hash_lock is held for writing, and it may hold
addrconf_hash_lock for a long time. So I think it's better to acquire
addrconf_hash_lock for reading instead of writing

Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 13:42:20 -08:00
Kris Katterjohn fb0d366b08 [NET]: Reject socket filter if division by constant zero is attempted.
This way we don't have to check it in sk_run_filter().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 13:41:34 -08:00
Andrea Bittau aa8751667d [PKT_SCHED]: sch_netem: correctly order packets to be sent simultaneously
If two packets were queued to be sent at the same time in the future,
their order would be reversed.  This would occur because the queue is
traversed back to front, and a position is found by checking whether
the new packet needs to be sent before the packet being examined.  If
the new packet is to be sent at the same time of a previous packet, it
would end up before the old packet in the queue.  This patch places
packets in the correct order when they are queued to be sent at a same
time in the future.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 13:41:05 -08:00
YOSHIFUJI Hideaki df9890c31a [IPV6]: Fix sending extension headers before and including routing header.
Based on suggestion from Masahide Nakamura <nakam@linux-ipv6.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-11-20 12:23:18 +09:00
Ville Nuorvala a305989386 [IPV6]: Fix calculation of AH length during filling ancillary data.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-11-20 12:21:59 +09:00
YOSHIFUJI Hideaki 8b8aa4b5a6 [IPV6]: Fix memory management error during setting up new advapi sockopts.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-11-20 12:18:17 +09:00
Jeff Garzik 638cbac8de Merge branch 'master' 2005-11-18 13:23:21 -05:00
David S. Miller 9e147a1cfc [IPV6]: Fib dump really needs GFP_ATOMIC.
Revert: 8225ccbaf0

Based upon a report by Yan Zheng.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-17 16:52:51 -08:00
Roman Zippel 05b8b0fafd [NET]: Sanitize NET_SCHED protection in /net/sched/Kconfig
On Thu, 17 Nov 2005, David Gmez wrote:

> I found out that if i select NET_CLS_ROUTE4, save my changes and exit
> menuconfig, execute again make menuconfig and go to QoS options, then the new
> available options are visible. So menuconfig has some problem refreshing
> contents :?

No, they were there before too, but you have to go up one level to see 
them.

It's better in 2.6.15-rc1-git5, but the menu structure is still a little 
messed up, the patch below properly indents all menu entries.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-17 15:22:39 -08:00
David S. Miller 381998241f [LLC]: Fix compiler warnings introduced by TX window scaling changes.
Noticed by Olaf Hering.

The comparisons want a u8 here (the data type on the left-hand branch
is a u8 structure member, and the constant on the right-hand branch is
"~((u8) 128)"), but C turns it into an integer so we get:

net/llc/llc_c_ac.c: In function `llc_conn_ac_inc_npta_value':
net/llc/llc_c_ac.c:998: warning: comparison is always true due to limited range of data type
net/llc/llc_c_ac.c:999: warning: large integer implicitly truncated to unsigned type

Fix this up by explicitly recasting the right-hand branch constant
into a "u8" once more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-17 15:17:42 -08:00
Harald Welte 2fce76afdb [NETFILTER] ip_conntrack: fix ftp/irc/tftp helpers on ports >= 32768
Since we've converted the ftp/irc/tftp helpers to use the new
module_parm_array() some time ago, we ware accidentially using signed data
types - thus preventing those modules from being used on ports >= 32768.

This patch fixes it by using 'ushort' module parameters.

Thanks to Jan Nijs for reporting this bug.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-17 15:06:47 -08:00
Stephen Hemminger bd6af700a7 [TCP]: TCP highspeed build error
There is a compile error that crept in with the last patch of
TCP patches.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-17 14:11:18 -08:00
Patrick McHardy 4a59a81051 [NETFILTER]: Fix nf_conntrack compilation with CONFIG_NETFILTER_DEBUG
CC [M]  net/netfilter/nf_conntrack_core.o
net/netfilter/nf_conntrack_core.c: In function 'nf_ct_unlink_expect':
net/netfilter/nf_conntrack_core.c:390: error: 'exp_timeout' undeclared (first use in this function)
net/netfilter/nf_conntrack_core.c:390: error: (Each undeclared identifier is reported only once
net/netfilter/nf_conntrack_core.c:390: error: for each function it appears in.)

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-16 23:14:19 -08:00
Yasuyuki Kozakai e7c8a41e81 [IPV4,IPV6]: replace handmade list with hlist in IPv{4,6} reassembly
Both of ipq and frag_queue have *next and **prev, and they can be replaced
with hlist. Thanks Arnaldo Carvalho de Melo for the suggestion.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-16 12:55:37 -08:00
Linus Torvalds f6ff56cd56 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-11-15 16:59:38 -08:00
KOVACS Krisztian 5a6f294e43 [NETFILTER] Free layer-3 specific protocol tables at cleanup
Although the comment around the allocation code tells us that
the layer-3 specific protocol tables will be freed when cleaning up,
they aren't. And this makes nfsim complain loudly...

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-15 16:47:34 -08:00
KOVACS Krisztian 96479376c8 [NETFILTER] Remove nf_conntrack stat proc file when cleaning up
Fix nf_conntrack statistics proc file removal. Looks like the old bug
was forward-ported from ip_conntrack. :-]

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-15 16:47:09 -08:00
Stephen Hemminger 31f3426904 [TCP]: More spelling fixes.
From Joe Perches

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-15 15:17:10 -08:00
NeilBrown 1887b93529 [PATCH] knfsd: make sure nfsd doesn't hog a cpu forever
Being kernel-threads, nfsd servers don't get pre-empted (depending on
CONFIG).  If there is a steady stream of NFS requests that can be served
from cache, an nfsd thread may hold on to a cpu indefinitely, which isn't
very friendly.

So it is good to have a cond_resched in there (just before looking for a
new request to serve), to make sure we play nice.

Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15 08:59:19 -08:00
Jeff Garzik f055408957 Merge branch 'master' 2005-11-15 04:51:40 -05:00
Jochen Friedrich 451677c46f [LLC]: Make core block on remote busy.
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 21:57:46 -08:00
Jochen Friedrich 59c6196e59 [LLC]: Fix TX window scaling
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 21:57:15 -08:00
Luiz Capitulino cb422c464b [IPV6]: Fixes sparse warning in ipv6/ipv6_sockglue.c
The patch below fixes the following sparse warning:

net/ipv6/ipv6_sockglue.c:291:13: warning: Using plain integer as NULL pointer

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 21:43:36 -08:00
Yan Zheng 12da2a435c [IPV6]: small fix for ipv6_dev_get_saddr(...)
The "score.rule++" doesn't make any sense for me. 
According to codes above, I think it should be "hiscore.rule++;" .

Signed-off-by: Yan Zheng<yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 21:42:46 -08:00
Yasuyuki Kozakai 302fe1758d [NETFILTER] fix leak of fragment queue at unloading nf_conntrack_ipv6
This patch makes nf_conntrack_ipv6 free all IPv6 fragment queues at module
unloading time.  Also introduce a BUG_ON if we ever again have leaks in
the memory accounting.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:28:45 -08:00
Yasuyuki Kozakai 1ba430bc3e [NETFILTER] nf_conntrack: fix possibility of infinite loop while evicting nf_ct_frag6_queue
This synchronizes nf_ct_reasm with ipv6 reassembly, and fixes a possibility
of an infinite loop if CPUs evict and create nf_ct_frag6_queue in parallel.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:28:18 -08:00
Yasuyuki Kozakai 7686a02c0e [NETFILTER]: fix type of sysctl variables in nf_conntrack_ipv6
These variables should be unsigned.  This fixes sysctl handler for
nf_ct_frag6_{low,high}_thresh.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:27:43 -08:00
Yasuyuki Kozakai 9bdf87d90b [NETFILTER]: cleanup IPv6 Netfilter Kconfig
This removes linux 2.4 configs in comments as TODO lists.
And this also move the entry of nf_conntrack to top like IPv4 Netfilter
Kconfig.

Based on original patch by Krzysztof Piotr Oledzki <ole@ans.pl>.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:26:58 -08:00
Krzysztof Oledzki 47d4305bf2 [NETFILTER]: link 'netfilter' before ipv4
Staticaly linked nf_conntrack_ipv4 requires nf_conntrack. but currently
nf_conntrack is linked after it. This changes the order of ipv4 and netfilter
to fix this.

Signed-off-by: Krzysztof Oledzki <olenf@ans.pl>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:25:59 -08:00
Harald Welte 37d2e7a20d [NETFILTER] nfnetlink: unconditionally require CAP_NET_ADMIN
This patch unconditionally requires CAP_NET_ADMIN for all nfnetlink
messages.  It also removes the per-message cap_required field, since all
existing subsystems use CAP_NET_ADMIN for all their messages anyway.

Patrick McHardy owes me a beer if we ever need to re-introduce this.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:24:59 -08:00
KOVACS Krisztian 3746a2b140 [NETFILTER] nf_conntrack: Add missing code to TCP conntrack module
Looks like the nf_conntrack TCP code was slightly mismerged: it does
not contain an else branch present in the IPv4 version. Let's add that
code and make the testsuite happy.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:23:01 -08:00
Pablo Neira Ayuso 5655820852 [NETFILTER] ctnetlink: More thorough size checking of attributes
Add missing size checks. Thanks Patrick McHardy for the hint.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:22:11 -08:00
Pablo Neira Ayuso dbd36ea496 [NETFILTER] ctnetlink: use size_t to make gcc-4.x happy
Make gcc-4.x happy. Use size_t instead of int. Thanks to Patrick McHardy
for the hint.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-14 15:21:01 -08:00
Mitch Williams c2373ee989 [PATCH] net: make dev_valid_name public
dev_valid_name() is a useful function.  Make it public.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-13 14:48:18 -05:00
Mitch Williams 1e2e565965 [PATCH] net: allow newline terminated IP addresses in in_aton
in_aton() gives weird results if it sees a newline at the end of the
input. This patch makes it able to handle such input correctly.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-13 14:48:17 -05:00
Thomas Graf 8225ccbaf0 [IPV6]: Fix unnecessary GFP_ATOMIC allocation in fib6 dump
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12 12:15:16 -08:00
Vlad Drukker a2d7222f0f [NETFILTER] {ip,nf}_conntrack TCP: Accept SYN+PUSH like SYN
Some devices (e.g. Qlogic iSCSI HBA hardware like QLA4010 up to firmware
3.0.0.4) initiates TCP with SYN and PUSH flags set.

The Linux TCP/IP stack deals fine with that, but the connection tracking
code doesn't.

This patch alters TCP connection tracking to accept SYN+PUSH as a valid
flag combination.

Signed-off-by: Vlad Drukker <vlad@storewiz.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12 12:13:14 -08:00
Herbert Xu efacfbcb6c [IPV6]: Fix rtnetlink dump infinite loop
The recent change to netlink dump "done" callback handling broke IPv6
which played dirty tricks with the "done" callback.  This causes an
infinite loop during a dump.

The following patch fixes it.

This bug was reported by Jeff Garzik.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12 12:12:05 -08:00
Neil Horman 049b3ff5a8 [SCTP]: Include ulpevents in socket receive buffer accounting.
Also introduces a sysctl option to configure the receive buffer
accounting policy to be either at socket or association level.
Default is all the associations on the same socket share the
receive buffer.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 16:08:24 -08:00
Vladislav Yasevich 1e7d3d90c9 [SCTP]: Remove timeouts[] array from sctp_endpoint.
The socket level timeout values are maintained in sctp_sock and
association level timeouts are in sctp_association. So there is
no need for ep->timeouts.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 16:06:16 -08:00
Vladislav Yasevich 23ec47a088 [SCTP]: Fix potential NULL pointer dereference in sctp_v4_get_saddr
It is possible to get to sctp_v4_get_saddr() without a valid
association.  This happens when processing OOTB packets and
the cached route entry is no longer valid.
However, when responding to OOTB packets we already properly
set the source address based on the information in the OOTB
packet.  So, if we we get to sctp_v4_get_saddr() without an
association we can simply return.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 16:05:55 -08:00
David S. Miller 8eb5591052 [IPV6]: Fix inet6_init missing unregister.
Based mostly upon a patch from Olaf Kirch <okir@suse.de>

When initialization fails in inet6_init(), we should
unregister the PF_INET6 socket ops.

Also, check sock_register()'s return value for errors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 15:05:47 -08:00
Patrick Caulfield 9eb5c94ef2 [DECNET]: fix SIGPIPE
Currently recvmsg generates SIGPIPE whereas sendmsg does not; for the
other stacks it seems to be the other way round!

It also fixes the bug where reading from a socket whose peer has shutdown
returned -EINVAL rather than 0.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 12:04:28 -08:00
Jeff Garzik c050970a25 [PATCH] TCP: fix vegas build
Recent TCP changes broke the build.

Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-11 09:21:28 -08:00
Stephen Hemminger 6a438bbe68 [TCP]: speed up SACK processing
Use "hints" to speed up the SACK processing. Various forms 
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 17:14:59 -08:00
Stephen Hemminger caa20d9abe [TCP]: spelling fixes
Minor spelling fixes for TCP code.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 17:13:47 -08:00
John Heffner 326f36e9e7 [TCP]: receive buffer growth limiting with mixed MTU
This is a patch for discussion addressing some receive buffer growing issues.
This is partially related to the thread "Possible BUG in IPv4 TCP window
handling..." last week.

Specifically it addresses the problem of an interaction between rcvbuf
moderation (receiver autotuning) and rcv_ssthresh.  The problem occurs when
sending small packets to a receiver with a larger MTU.  (A very common case I
have is a host with a 1500 byte MTU sending to a host with a 9k MTU.)  In
such a case, the rcv_ssthresh code is targeting a window size corresponding
to filling up the current rcvbuf, not taking into account that the new rcvbuf
moderation may increase the rcvbuf size.

One hunk makes rcv_ssthresh use tcp_rmem[2] as the size target rather than
rcvbuf.  The other changes the behavior when it overflows its memory bounds
with in-order data so that it tries to grow rcvbuf (the same as with
out-of-order data).

These changes should help my problem of mixed MTUs, and should also help the
case from last week's thread I think.  (In both cases though you still need
tcp_rmem[2] to be set much larger than the TCP window.)  One question is if
this is too aggressive at trying to increase rcvbuf if it's under memory
stress.

Orignally-from: John Heffner <jheffner@psc.edu>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 17:11:48 -08:00
Stephen Hemminger 9772efb970 [TCP]: Appropriate Byte Count support
This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.

The orignal ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 17:09:53 -08:00
Stephen Hemminger 7faffa1c7f [TCP]: add tcp_slow_start helper
Move all the code that does linear TCP slowstart to one
inline function to ease later patch to add ABC support.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 17:07:24 -08:00
Stephen Hemminger 2d2abbab63 [TCP]: simplify microsecond rtt sampling
Simplify the code that comuputes microsecond rtt estimate used
by TCP Vegas. Move the callback out of the RTT sampler and into
the end of the ack cleanup.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 16:56:12 -08:00
Stephen Hemminger f4805eded7 [TCP]: fix congestion window update when using TSO deferal
TCP peformance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.

The problem is with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window, we hold out to until 
we can send a MSS chunk. But, we also don't update the congestion window 
unless we have filled, as per RFC2861.

This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 16:53:30 -08:00
Herbert Xu fb286bb299 [NET]: Detect hardware rx checksum faults correctly
Here is the patch that introduces the generic skb_checksum_complete
which also checks for hardware RX checksum faults.  If that happens,
it'll call netdev_rx_csum_fault which currently prints out a stack
trace with the device name.  In future it can turn off RX checksum.

I've converted every spot under net/ that does RX checksum checks to
use skb_checksum_complete or __skb_checksum_complete with the
exceptions of:

* Those places where checksums are done bit by bit.  These will call
netdev_rx_csum_fault directly.

* The following have not been completely checked/converted:

ipmr
ip_vs
netfilter
dccp

This patch is based on patches and suggestions from Stephen Hemminger
and David S. Miller.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 13:01:24 -08:00
Linus Torvalds b01a55a865 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-11-09 19:32:25 -08:00
Trond Myklebust 940e3318c3 [PATCH] SUNRPC: don't reencode when looping in call transmit.
If the call to xprt_transmit() fails due to socket buffer space
exhaustion, we do not need to re-encode the RPC message when we
loop back through call_transmit.

Re-encoding can actually end up triggering the WARN_ON() in
call_decode() if we re-encode something like a read() request and
auth->au_rslack has changed.
It can also cause us to increment the RPCSEC_GSS sequence number
beyond the limits of the allowed window.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 19:31:33 -08:00
Thomas Graf 482a8524f8 [NETLINK]: Generic netlink family
The generic netlink family builds on top of netlink and provides
simplifies access for the less demanding netlink users. It solves
the problem of protocol numbers running out by introducing a so
called controller taking care of id management and name resolving.

Generic netlink modules register themself after filling out their
id card (struct genl_family), after successful registration the
modules are able to register callbacks to command numbers by
filling out a struct genl_ops and calling genl_register_op(). The
registered callbacks are invoked with attributes parsed making
life of simple modules a lot easier.

Although generic netlink modules can request static identifiers,
it is recommended to use GENL_ID_GENERATE and to let the controller
assign a unique identifier to the module. Userspace applications
will then ask the controller and lookup the idenfier by the module
name.

Due to the current multicast implementation of netlink, the number
of generic netlink modules is restricted to 1024 to avoid wasting
memory for the per socket multiacst subscription bitmask.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:41 +01:00
Thomas Graf 9ac4a16983 [RTNETLINK]: Use generic netlink receive queue processor
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf 88fc2c8431 [XFRM]: Use generic netlink receive queue processor
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf 82ace47a72 [NETLINK]: Generic netlink receive queue processor
Introduces netlink_run_queue() to handle the receive queue of
a netlink socket in a generic way. Processes as much as there
was in the queue upon entry and invokes a callback function
for each netlink message found. The callback function may
refuse a message by returning a negative error code but setting
the error pointer to 0 in which case netlink_run_queue() will
return with a qlen != 0.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf a8f74b2288 [NETLINK]: Make netlink_callback->done() optional
Most netlink families make no use of the done() callback, making
it optional gets rid of all unnecessary dummy implementations.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf bfa83a9e03 [NETLINK]: Type-safe netlink messages/attributes interface
Introduces a new type-safe interface for netlink message and
attributes handling. The interface is fully binary compatible
with the old interface towards userspace. Besides type safety,
this interface features attribute validation capabilities,
simplified message contstruction, and documentation.

The resulting netlink code should be smaller, less error prone
and easier to understand.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Yasuyuki Kozakai 9fb9cbb108 [NETFILTER]: Add nf_conntrack subsystem.
The existing connection tracking subsystem in netfilter can only
handle ipv4.  There were basically two choices present to add
connection tracking support for ipv6.  We could either duplicate all
of the ipv4 connection tracking code into an ipv6 counterpart, or (the
choice taken by these patches) we could design a generic layer that
could handle both ipv4 and ipv6 and thus requiring only one sub-protocol
(TCP, UDP, etc.) connection tracking helper module to be written.

In fact nf_conntrack is capable of working with any layer 3
protocol.

The existing ipv4 specific conntrack code could also not deal
with the pecularities of doing connection tracking on ipv6,
which is also cured here.  For example, these issues include:

1) ICMPv6 handling, which is used for neighbour discovery in
   ipv6 thus some messages such as these should not participate
   in connection tracking since effectively they are like ARP
   messages

2) fragmentation must be handled differently in ipv6, because
   the simplistic "defrag, connection track and NAT, refrag"
   (which the existing ipv4 connection tracking does) approach simply
   isn't feasible in ipv6

3) ipv6 extension header parsing must occur at the correct spots
   before and after connection tracking decisions, and there were
   no provisions for this in the existing connection tracking
   design

4) ipv6 has no need for stateful NAT

The ipv4 specific conntrack layer is kept around, until all of
the ipv4 specific conntrack helpers are ported over to nf_conntrack
and it is feature complete.  Once that occurs, the old conntrack
stuff will get placed into the feature-removal-schedule and we will
fully kill it off 6 months later.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-09 16:38:16 -08:00
Ken-ichirou MATSUZAWA 9f0ede52a0 [IPV6]: ip6ip6_lock is not unlocked in error path.
From: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:08:29 -08:00
Peter Chubb 44fd0261d3 [IPV6]: Fix fallout from CONFIG_IPV6_PRIVACY
Trying to build today's 2.6.14+git snapshot gives undefined references
to use_tempaddr

Looks like an ifdef got left out.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:05:47 -08:00
Krzysztof Piotr Oledzki 5fd52fe098 [NETFILTER] ctnetlink: ICMP_ID is u_int16_t not u_int8_t.
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:04:32 -08:00
Krzysztof Piotr Oledzki 439a9994bb [NETFILTER] ctnetlink: Fix oops when no ICMP ID info in message
This patch fixes an userspace triggered oops. If there is no ICMP_ID
info the reference to attr will be NULL.

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:04:08 -08:00
Pablo Neira Ayuso a856a19a9f [NETFILTER] ctnetlink: Add support to identify expectations by ID's
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:03:42 -08:00
Pablo Neira Ayuso fcda46128d [NETFILTER] ctnetlink: propagate error instaed of returning -EPERM
Propagate the error to userspace instead of returning -EPERM if the get
conntrack operation fails.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:03:26 -08:00
Pablo Neira Ayuso fe902a91ff [NETFILTER] ctnetlink: return -EINVAL if size is wrong
Return -EINVAL if the size isn't OK instead of -EPERM.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:03:09 -08:00
Yasuyuki Kozakai d63a928108 [NETFILTER]: stop tracking ICMP error at early point
Currently connection tracking handles ICMP error like normal packets
if it failed to get related connection. But it fails that after all.

This makes connection tracking stop tracking ICMP error at early point.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:02:45 -08:00
Harald Welte ed77de9fc6 [NETFILTER] nfnetlink: only load subsystems if CAP_NET_ADMIN is set
Without this patch, any user can cause nfnetlink subsystems to be
autoloaded.  Those subsystems however could add significant processing
overhead to packet processing, and would refuse any configuration messages
from non-CAP_NET_ADMIN processes anyway.

This patch follows a suggestion from Patrick McHardy.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:02:16 -08:00
Philip Craig 5978a9b82c [NETFILTER] PPTP helper: fix PNS-PAC expectation call id
The reply tuple of the PNS->PAC expectation was using the wrong call id.

So we had the following situation:
- PNS behind NAT firewall
- PNS call id requires NATing
- PNS->PAC gre packet arrives first

then the PNS->PAC expectation is matched, and the other expectation
is deleted, but the PAC->PNS gre packets do not match the gre conntrack
because the call id is wrong.

We also cannot use ip_nat_follow_master().

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:01:53 -08:00
Pablo Neira Ayuso 81e5c27d08 [NETFILTER] ctnetlink: get_conntrack can use GFP_KERNEL
ctnetlink_get_conntrack is always called from user context, so GFP_KERNEL
is enough.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:01:19 -08:00
Pablo Neira Ayuso 7a4fe3664b [NETFILTER] ctnetlink: kill unused includes
Kill some useless headers included in ctnetlink. They aren't used in any
way.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:00:47 -08:00
Pablo Neira Ayuso 119a318494 [NETFILTER] ctnetlink: add module alias to fix autoloading
Add missing module alias. This is a must to load ctnetlink on demand. For
example, the conntrack tool will fail if the module isn't loaded.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:00:29 -08:00
Pablo Neira Ayuso 02a78cdf42 [NETFILTER] ctnetlink: add marking support from userspace
This patch adds support for conntrack marking from user space.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 13:00:04 -08:00
Pablo Neira Ayuso 51df784ed7 [NETFILTER] ctnetlink: check if protoinfo is present
This fixes an oops triggered from userspace. If we don't pass information
about the private protocol info, the reference to attr will be NULL. This is
likely to happen in update messages.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:59:41 -08:00
Harald Welte a2506c0432 [NETFILTER] nfnetlink: nfattr_parse() can never fail, make it void
nfattr_parse (and thus nfattr_parse_nested) always returns success. So we
can make them 'void' and remove all the checking at the caller side.

Based on original patch by Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:59:13 -08:00
Yasuyuki Kozakai eaae4fa45e [NETFILTER]: refcount leak of proto when ctnetlink dumping tuple
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:58:46 -08:00
Yasuyuki Kozakai 46998f59c0 [NETFILTER]: packet counter of conntrack is 32bits
The packet counter variable of conntrack was changed to 32bits from 64bits.
This follows that change.
		    
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:58:05 -08:00
Linus Torvalds a7c243b544 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-11-09 08:34:36 -08:00
Christoph Hellwig 49705b7743 [PATCH] sanitize lookup_hash prototype
->permission and ->lookup have a struct nameidata * argument these days to
pass down lookup intents.  Unfortunately some callers of lookup_hash don't
actually pass this one down.  For lookup_one_len() we don't have a struct
nameidata to pass down, but as this function is a library function only
used by filesystem code this is an acceptable limitation.  All other
callers should pass down the nameidata, so this patch changes the
lookup_hash interface to only take a struct nameidata argument and derives
the other two arguments to __lookup_hash from it.  All callers already have
the nameidata argument available so this is not a problem.

At the same time I'd like to deprecate the lookup_hash interface as there
are better exported interfaces for filesystem usage.  Before it can
actually be removed I need to fix up rpc_pipefs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:00 -08:00
Christoph Hellwig e4543eddfd [PATCH] add a vfs_permission helper
Most permission() calls have a struct nameidata * available.  This helper
takes that as an argument and thus makes sure we pass it down for lookup
intents and prepares for per-mount read-only support where we need a struct
vfsmount for checking whether a file is writeable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:55:58 -08:00
Christoph Hellwig e3305626e0 ieee80211: cleanup crypto list handling, other minor cleanups. 2005-11-09 01:01:04 -05:00
Jeff Garzik f24e09754b Merge rsync://bughost.org/repos/ieee80211-delta/ 2005-11-09 00:00:29 -05:00
Marcel Holtmann be9d122730 [Bluetooth]: Remove the usage of /proc completely
This patch removes all relics of the /proc usage from the Bluetooth
subsystem core and its upper layers. All the previous information are
now available via /sys/class/bluetooth through appropriate functions.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:57:38 -08:00
Marcel Holtmann 1ebb92521d [Bluetooth]: Add endian annotations to the core
This patch adds the endian annotations to the Bluetooth core.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:57:21 -08:00
Herbert Xu 89f5f0aeed [IPV4]: Fix ip_queue_xmit identity increment for TSO packets
When ip_queue_xmit calls ip_select_ident_more for IP identity selection
it gives it the wrong packet count for TSO packets.  The ip_select_*
functions expect one less than the number of packets, so we need to
subtract one for TSO packets.

This bug was diagnosed and fixed by Tom Young.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:41:56 -08:00
Jesper Juhl a51482bde2 [NET]: kfree cleanup
From: Jesper Juhl <jesper.juhl@gmail.com>

This is the net/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in net/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-11-08 09:41:34 -08:00
Julian Anastasov dc8103f25f [IPVS]: fix connection leak if expire_nodest_conn=1
There was a fix in 2.6.13 that changed the behaviour of
ip_vs_conn_expire_now function not to put reference to connection,
its callers should hold write lock or connection refcnt. But we
forgot to convert one caller, when the real server for connection
is unavailable caller should put the connection reference. It
happens only when sysctl var expire_nodest_conn is set to 1 and
such connections never expire. Thanks to Roberto Nibali who found
the problem and tested a 2.4.32-rc2 patch, which is equal to this
2.6 version. Patch for 2.4 is already sent to Marcelo.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:40:05 -08:00
Thomas Graf b541ca2c5a [PKT_SCHED]: Correctly handle empty ematch trees
Fixes an invalid memory reference when the basic classifier
is used without any ematches but just actions.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:39:17 -08:00
YOSHIFUJI Hideaki 072047e4de [IPV6]: RFC3484 compliant source address selection
Choose more appropriate source address; e.g.
 - outgoing interface
 - non-deprecated
 - scope
 - matching label

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:38:30 -08:00
YOSHIFUJI Hideaki b1cacb6820 [IPV6]: Make ipv6_addr_type() more generic so that we can use it for source address selection.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:38:12 -08:00
YOSHIFUJI Hideaki 971f359ddc [IPV6]: Put addr_diff() into common header for future use.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-08 09:37:56 -08:00
Jeff Garzik 3133c5e896 Merge git://git.tuxdriver.com/git/netdev-jwl 2005-11-07 22:54:48 -05:00
Adrian Bunk fd7a516efb [PATCH] fix NET_RADIO=n, IEEE80211=y compile
This patch fixes the following compile error with CONFIG_NET_RADIO=n and
CONFIG_IEEE80211=y:

  LD      .tmp_vmlinux1
net/built-in.o: In function `ieee80211_rx':
: undefined reference to `wireless_spy_update'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-07 21:50:00 -05:00
Volker Braun e189277a3f Fix problem with WEP unicast key > index 0
The functions ieee80211_wx_{get,set}_encodeext fail if one tries to set
unicast (IW_ENCODE_EXT_GROUP_KEY not set) keys at key indices>0. But at
least some Cisco APs dish out dynamic WEP unicast keys at index !=0.

Signed-off-by: Volker Braun <volker.braun@physik.hu-berlin.de>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-11-07 16:19:02 -06:00
James Ketrenos 81f875208e scripts/Lindent on ieee80211 subsystem.
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-11-07 16:18:48 -06:00
Linus Torvalds 8e33ba4976 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-11-07 08:05:11 -08:00
Linus Torvalds 8cde0776ec Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-11-07 08:04:01 -08:00
NeilBrown 80d188a643 [PATCH] knfsd: make sure svc_process call the correct pg_authenticate for multi-service port
If an RPC socket is serving multiple programs, then the pg_authenticate of
the first program in the list is called, instead of pg_authenticate for the
program to be run.

This does not cause a problem with any programs in the current kernel, but
could confuse future code.

Also set pg_authenticate for nfsd_acl_program incase it ever gets used.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:49 -08:00
Jeff Garzik a10b5aacea Remove linux/version.h include from drivers/net/phy/* and net/ieee80211/*.
Unused, and causes the files to be needlessly rebuilt in some cases.
2005-11-05 23:39:54 -05:00
Arnaldo Carvalho de Melo 2d43f1128a Merge branch 'red' of 84.73.165.173:/home/tgr/repos/net-2.6 2005-11-05 22:30:29 -02:00
Stephen Hemminger 6df716340d [TCP/DCCP]: Randomize port selection
This patch randomizes the port selected on bind() for connections
to help with possible security attacks. It should also be faster
in most cases because there is no need for a global lock.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 21:23:15 -02:00
Herbert Xu 6151b31c96 [NET]: Fix race condition in sk_stream_wait_connect
When sk_stream_wait_connect detects a state transition to ESTABLISHED
or CLOSE_WAIT prior to it going to sleep, it will return without
calling finish_wait and decrementing sk_write_pending.

This may result in crashes and other unintended behaviour.

The fix is to always call finish_wait and update sk_write_pending since
it is safe to do so even if the wait entry is no longer on the queue.

This bug was tracked down with the help of Alex Sidorenko and the
fix is also based on his suggestion.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 21:05:20 -02:00
Stephen Hemminger eb229c4cdc [NETEM]: Add version string
Add a version string to help support issues.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 20:59:21 -02:00
Stephen Hemminger 300ce174eb [NETEM]: Support time based reordering
Change netem to support packets getting reordered because of variations in
delay. Introduce a special case version of FIFO that queues packets in order
based on the netem delay.

Since netem is classful, those users that don't want jitter based reordering
can just insert a pfifo instead of the default.

This required changes to generic skbuff code to allow finer grain manipulation
of sk_buff_head.  Insertion into the middle and reverse walk.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 20:56:41 -02:00
Thomas Graf bdc450a0bb [PKT_SCHED]: (G)RED: Introduce hard dropping
Introduces a new flag TC_RED_HARDDROP which specifies that if ECN
marking is enabled packets should still be dropped once the
average queue length exceeds the maximum threshold.

This _may_ help to avoid global synchronisation during small
bursts of peers advertising but not caring about ECN. Use this
option very carefully, it does more harm than good if
(qth_max - qth_min) does not cover at least two average burst
cycles.

The difference to the current behaviour, in which we'd run into
the hard queue limit, is that due to the low pass filter of RED
short bursts are less likely to cause a global synchronisation.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:29 +01:00
Thomas Graf b38c7eef7e [PKT_SCHED]: GRED: Support ECN marking
Adds a new u8 flags in a unused padding area of the netlink
message. Adds ECN marking support to be used instead of dropping
packets immediately.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:29 +01:00
Thomas Graf d8f64e1960 [PKT_SCHED]: GRED: Fix restart of idle period in WRED mode upon dequeue and drop
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 1e4dfaf9b9 [PKT_SCHED]: GRED: Cleanup and remove unnecessary code
Removes unnecessary includes, initializers, and simplifies
the code a bit.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 6214e653cc [PKT_SCHED]: GRED: Remove auto-creation of default VQ
Since we are no longer depending on the default VQ to be always
allocated we can leave it up to the user to actually create it.
This gives the user the ability to leave it out on purpose and
enqueue packets directly to the device without applying the RED
algorithm.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 7051703b99 [PKT_SCHED]: GRED: Dont abuse default VQ for equalizing
Introduces a new red parameter set for use in equalize mode,
although only the qavg variable and the idle period marker are
being used for now this makes it possible to allow a separate
parameter set to be used for equalize later on.

The use of this separate parameter set fixes a bogus start of
an idle period in gred_drop() which did start an idle period
on the default VQ even if equalize mode was disabled.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 4a591834cf [PKT_SCHED]: GRED: Remove initd flag
The case when the default VQ is not set up yet is already handled
in a less error prone way.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 18e3fb84e6 [PKT_SCHED]: GRED: Improve error handling and messages
Try to enqueue packets if we cannot associate it with a VQ, this
basically means that the default VQ has not been set up yet.

We must check if the VQ still exists while requeueing, the VQ
might have been changed between dequeue and the requeue of the
underlying qdisc.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Thomas Graf 716a1b40b0 [PKT_SCHED]: GRED: Introduce tc_index_to_dp()
Adds a transformation function returning the DP index for a
given skb according to its tc_index.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf edf7a7b1f0 [PKT_SCHED]: GRED: Use generic queue management interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf c3b553cdaf [PKT_SCHED]: GRED: Report congestion related drops as NET_XMIT_CN
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf 301d063c29 [PKT_SCHED]: GRED: Do not reset statistics in gred_reset/gred_change
Qdiscs are not supposed to reset statistics in reset() and while
changing parameters. My argumentation is that if the user wants
the counters to be reset he can simply remove and readd the
qdiscs, that's what most users do anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf 22b33429ab [PKT_SCHED]: GRED: Use new generic red interface
Simplifies code a lot by separating the red algorithm and the
queueing logic. We now differentiate between probability marks
and forced marks but sum them together again to not break
backwards compatibility.

This brings GRED back to the level of RED and improves the
accuracy of the averge queue length calculations when stab
suggests a zero shift.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf f62d6b936d [PKT_SCHED]: GRED: Use central VQ change procedure
Introduces a function gred_change_vq() acting as a central point
to change VQ parameters. Fixes priority inheritance in rio mode
when the default DP equals 0. Adds proper locking during changes.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:27 +01:00
Thomas Graf a8aaa9958e [PKT_SCHED]: GRED: Report out-of-bound DPs as illegal
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:26 +01:00
Thomas Graf 6639607ed9 [PKT_SCHED]: GRED: Use a central table definition change procedure
Introduces a function gred_change_table_def() acting as a central
point to change the table definition.

Adds missing validations for table definition: MAX_DPs > DPs > 0
and def_DP < DPs thus fixing possible invalid memory reference
oopses. Only root could do it but having a typo crashing the
machine is a bit hard.

Adds missing locking while changing the table definition, the
operation of changing the number of DPs and removing shadowed VQs
may not be interrupted by a dequeue.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:26 +01:00
Thomas Graf e06368221c [PKT_SCHED]: GRED: Dump table definition
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:26 +01:00
Thomas Graf 05f1cc01b4 [PKT_SCHED]: GRED: Cleanup dumping
Avoids the allocation of a buffer by appending the VQs directly
to the skb and simplifies the code by using the appropriate
message construction macros.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:26 +01:00
Thomas Graf d6fd4e9667 [PKT_SCHED]: GRED: Transform grio to GRED_RIO_MODE
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Thomas Graf dea3f62852 [PKT_SCHED]: GRED: Cleanup equalize flag and add new WRED mode detection
Introduces a flags variable using bitops and transforms eqp to use
it. Converts the conditions of the form (wred && rio) to (wred)
since wred can only be enabled in rio mode anyway.

The patch also improves WRED mode detection. The current behaviour
does not allow WRED mode to be turned off again without removing
the whole qdisc first. The new algorithm checks each VQ against
each other looking for equal priorities every time a VQ is changed
or added. The performance is poor, O(n**2), but it's used only
during administrative tasks and the number of VQs is strictly
limited.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Thomas Graf dba051f36a [PKT_SCHED]: RED: Cleanup and remove unnecessary code
Removes the skb trimming code which is not needed since we never
touch the skb upon failure. Removes unnecessary includes,
initializers, and simplifies the code a bit. Removes Jamal's
obsolete email addresses upon his own request.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Thomas Graf 6a1b63d467 [PKT_SCHED]: RED: Dont start idle periods while already idling
We should not interrupt and restart an idle period while idling already.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Thomas Graf 9e178ff27c [PKT_SCHED]: RED: Use generic queue management interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Thomas Graf 6b31b28a44 [PKT_SCHED]: RED: Use new generic red interface
Simplifies code a lot by separating the red algorithm and the
queueing logic. We now differentiate between probability marks
and forced marks but sum them together again to not break
backwards compatibility.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:25 +01:00
Stephen Hemminger 07aaa11540 [NETEM]: use PSCHED_LESS
Convert netem to use PSCHED_LESS and warn if requeue fails.
With some of the psched clock sources, the subtraction doesn't
work always work right without wrapping.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 17:03:46 -02:00
Harald Welte 1758ee0ea2 [NETFILTER] nf_queue: Fix Ooops when no queue handler registered
With the new nf_queue generalization in 2.6.14, we've introduced a bug
that causes an oops as soon as a packet is queued but no queue handler
registered.  This patch fixes it.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 16:43:29 -02:00
Harald Welte 433a4d3b54 [NETFILTER]: CONNMARK target needs ip_conntrack
There's a missing dependency from the CONNMARK target to ip_conntrack.

Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 16:39:20 -02:00
Harald Welte 10dfdc69ea [NETFILTER] nfnetlink: Use kzalloc
These is a cleanup patch, kzalloc can be used in a couple of cases

Signed-off-by: Samir Bellabes <sbellabes@mandriva.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 16:35:27 -02:00
Harald Welte 0f81eb4db4 [NETFILTER]: Fix double free after netlink_unicast() in ctnetlink
It's not necessary to free skb if netlink_unicast() failed.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 03:28:37 -02:00
Harald Welte d2a7bb7141 [NETFILTER] NAT: Fix module refcount dropping too far
The unknown protocol is used as a fallback when a protocol isn't known.
Hence we cannot handle it failing, so don't set ".me".  It's OK, since we
only grab a reference from within the same module (iptable_nat.ko), so we
never take the module refcount from 0 to 1.

Also, remove the "protocol is NULL" test: it's never NULL.

Signed-off-by: Rusty Rusty <rusty@rustcorp.com.au>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 01:23:34 -02:00
Harald Welte d811552eda [NETFILTER] PPTP helper: Fix endianness bug in GRE key / CallID NAT
This endianness bug slipped through while changing the 'gre.key' field in the
conntrack tuple from 32bit to 16bit.

None of my tests caught the problem, since the linux pptp client always has
'0' as call id / gre key.  Only windows clients actually trigger the bug.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-04 23:19:17 -02:00
Harald Welte 3428c209c6 [NETFILTER] PPTP helper: Fix compilation of conntrack helper without NAT
This patch fixes compilation of the PPTP conntrack helper when NAT is
configured off.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-04 23:02:53 -02:00
Chuck Lever 0bbacc402e NFS,SUNRPC,NLM: fix unused variable warnings when CONFIG_SYSCTL is disabled
Fix some dprintk's so that NLM, NFS client, and RPC client compile
 cleanly if CONFIG_SYSCTL is disabled.

 Test plan:
 Compile kernel with CONFIG_NFS enabled and CONFIG_SYSCTL disabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:48 -05:00
Chuck Lever c556b75496 SUNRPC: allow sunrpc.o to link when CONFIG_SYSCTL is disabled
The sunrpc module should build properly even when CONFIG_SYSCTL is
 disabled.

 Reported by Jan-Benedict Glaw.

 Test plan:
 Compile kernel with CONFIG_NFS as a module and built-in, and CONFIG_SYSCTL
 enabled and disabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:45 -05:00
Thomas Graf 52ab4ac258 [PKT_SCHED]: Rework QoS and/or fair queueing configuration
Make "QoS and/or fair queueing" have its own menu, it's too big to be
inlined into "Network options". Remove the obsolete NET_QOS option.
Automatically select NET_CLS if needed. Do the same for NET_ESTIMATOR
but allow it to be selected manually for statistical purposes. Add
comments to separate queueing from classification. Fix dependencies
and ordering of classifiers. Improve descriptions/help texts and
remove outdated pieces.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-03 02:29:06 -02:00
Yan Zheng 979ad66312 [IPV6]: inet6_ifinfo_notify should use RTM_DELLINK in addrconf_ifdown
Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-03 01:03:05 -02:00
Herbert Xu c75d721c76 [NET]: Fix zero-size datagram reception
The recent rewrite of skb_copy_datagram_iovec broke the reception of
zero-size datagrams.  This patch fixes it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 22:25:04 -02:00
Stephen Hemminger 450b5b1898 [TCP]: BIC max increment too large
The max growth of BIC TCP is too large. Original code was based on
BIC 1.0 and the default there was 32. Later code (2.6.13) included
compensation for delayed acks, and should have reduced the default
value to 16; since normally TCP gets one ack for every two packets sent.

The current value of 32 makes BIC too aggressive and unfair to other
flows.

Submitted-by: Injong Rhee <rhee@eos.ncsu.edu>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 21:24:01 -02:00
Yan Zheng 8713dbf057 [MCAST]: ip[6]_mc_add_src should be called when number of sources is zero
And filter mode is exclude.

Further explanation by David Stevens:

Multicast source filters aren't widely used yet, and that's really the only
feature that's affected if an application actually exercises this bug, as far
as I can tell. An ordinary filter-less multicast join should still work, and
only forwarded multicast traffic making use of filters and doing empty-source
filters with the MSFILTER ioctl would be at risk of not getting multicast
traffic forwarded to them because the reports generated would not be based on
the correct counts.

Signed-off-by: Yan Zheng <yanzheng@21cn.com
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 21:03:57 -02:00
Yan Zheng 97300b5fdf [MCAST] IPv6: Check packet size when process Multicast
Signed-off-by: Yan Zheng <yanzheng@21cn.com
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 22:52:03 -02:00
Herbert Xu edc9e81917 [DCCP]: Set socket owner iff packet is not data
Here is a complimentary insurance policy for those feeling a bit insecure.
You don't have to accept this.  However, if you do, you can't blame me for
it :)
  
> 1) dccp_transmit_skb sets the owner for all packets except data packets.
  
We can actually verify this by looking at pkt_type.
  
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 22:30:02 -02:00
Herbert Xu 48918a4dbd [DCCP]: Simplify skb_set_owner_w semantics
While we're at it let's reorganise the set_owner_w calls a little so that:
  
1) dccp_transmit_skb sets the owner for all packets except data packets.
2) Add dccp_skb_entail to set owner for packets queued for retransmission.
3) Make dccp_transmit_skb static.
  
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 19:26:17 -02:00
Yan Zheng 9d17f21893 [IPV6]: Fix behavior of ip6_route_input() for link local address
I find that linux will reply echo request destined to an address which
belongs to an interface other than the one from which the request received.
This behavior doesn't make sense for link local address.

YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> said:

Please note that sender does need to setup neighbor entry by hand to reproduce
this bug.  (Link-local address on eth1 is not visible on eth0, from the point
of view of neighbor discovery in IPv6.)

 +--------+               +--------+
 | sender |               | router |
 +---+----+               +-+----+-+
     |eth0              eth0|    |eth1
-----+----------------------+-  -+--------------

Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@osdl.org> (forwarded)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 16:54:05 -02:00
Andrew Morton a3d7a9d775 [ROSE]: rose_heartbeat_expiry() locking fix
Missing unlock, as noted by Ted Unangst <tedu@coverity.com>.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 16:41:45 -02:00
Harald Welte 6b7d31fcdd [NETFILTER]: Add "revision" support to arp_tables and ip6_tables
Like ip_tables already has it for some time, this adds support for
having multiple revisions for each match/target.  We steal one byte from
the name in order to accomodate a 8 bit version number.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 16:36:08 -02:00
Stephen Hemminger 6ede2463c8 [BRIDGE]: Use ether_compare
Use compare_ether_addr in bridge code.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 16:34:10 -02:00
Jean Delvare 3fa63c7d82 [PATCH] Typo fix: dot after newline in printk strings
Typo fix: dots appearing after a newline in printk strings.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:20 -08:00
Herbert Xu 6df5b9f48d [CRYPTO] Simplify one-member scatterlist expressions
This patch rewrites various occurences of &sg[0] where sg is an array
of length one to simply sg.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-30 11:19:43 +11:00
David Hardeman 378f058cc4 [PATCH] Use sg_set_buf/sg_init_one where applicable
This patch uses sg_set_buf/sg_init_one in some places where it was
duplicated.

Signed-off-by: David Hardeman <david@2gen.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-30 11:19:43 +11:00
Al Viro a6e0eb3791 [PATCH] bluetooth hidp is broken on s390
Bluetooth HIDP selects INPUT and it really needs it to be there - module
depends on input core.  And input core is never built on s390...

Marked as broken on s390, for now; if somebody has better ideas, feel
free to fix it and remove dependency...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 10:35:08 -07:00
Jayachandran C 9fcc2e8a75 [IPV4]: Fix issue reported by Coverity in ipv4/fib_frontend.c
fib_del_ifaddr() dereferences ifa->ifa_dev, so the code already assumes that
ifa->ifa_dev is non-NULL, the check is unnecessary.

Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-29 02:53:39 -02:00
Stephen Hemminger 360ac8e2f1 [ETH]: ether address compare
Expose faster ether compare for use by protocols and other
driver. And change name to be more consistent with other ether
address manipulation routines in same file

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-29 02:23:58 -02:00
Arnaldo Carvalho de Melo 974f7bc578 Merge master.kernel.org:/pub/scm/linux/kernel/git/sridhar/lksctp-2.6 2005-10-28 23:35:02 -02:00
Ivan Skytte Jorgensen 64a0c1c81e [SCTP] Do not allow unprivileged programs initiating new associations on
privileged ports.

Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:39:02 -07:00
Ivan Skytte Jorgensen 96a339985d [SCTP] Allow SCTP_MAXSEG to revert to default frag point with a '0' value.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:36:12 -07:00
Ivan Skytte Jorgensen a1ab358269 [SCTP] Fix SCTP_SETADAPTION sockopt to use the correct structure.
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:33:24 -07:00
Ivan Skytte Jorgensen eaa5c54dbe [SCTP] Rename SCTP specific control message flags.
Rename SCTP specific control message flags to use SCTP_ prefix rather than
MSG_ prefix as per the latest sctp sockets API draft.

Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2005-10-28 15:10:00 -07:00
Linus Torvalds 84860bf064 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2005-10-28 13:09:47 -07:00
Yan Zheng f12baeab9d [MCAST] IPv6: Fix algorithm to compute Querier's Query Interval
5.1.3.  Maximum Response Code

   The Maximum Response Code field specifies the maximum time allowed
   before sending a responding Report.  The actual time allowed, called
   the Maximum Response Delay, is represented in units of milliseconds,
   and is derived from the Maximum Response Code as follows:

   If Maximum Response Code < 32768,
      Maximum Response Delay = Maximum Response Code

   If Maximum Response Code >=32768, Maximum Response Code represents a
   floating-point value as follows:

       0 1 2 3 4 5 6 7 8 9 A B C D E F
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| exp |          mant         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Maximum Response Delay = (mant | 0x1000) << (exp+3)


5.1.9.  QQIC (Querier's Query Interval Code)

   The Querier's Query Interval Code field specifies the [Query
   Interval] used by the Querier.  The actual interval, called the
   Querier's Query Interval (QQI), is represented in units of seconds,
   and is derived from the Querier's Query Interval Code as follows:

   If QQIC < 128, QQI = QQIC

   If QQIC >= 128, QQIC represents a floating-point value as follows:

       0 1 2 3 4 5 6 7
      +-+-+-+-+-+-+-+-+
      |1| exp | mant  |
      +-+-+-+-+-+-+-+-+

   QQI = (mant | 0x10) << (exp + 3)

                                                -- rfc3810

#define MLDV2_QQIC(value) MLDV2_EXP(0x80, 4, 3, value)
#define MLDV2_MRC(value) MLDV2_EXP(0x8000, 12, 3, value)

Above macro are defined in mcast.c. but 1 << 4 == 0x10 and 1 << 12 == 0x1000.
So the result computed by original Macro is larger.

Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-28 16:35:18 -02:00
Ananda Raju e89e9cf539 [IPv4/IPv6]: UFO Scatter-gather approach
Attached is kernel patch for UDP Fragmentation Offload (UFO) feature.

1. This patch incorporate the review comments by Jeff Garzik.
2. Renamed USO as UFO (UDP Fragmentation Offload)
3. udp sendfile support with UFO

This patches uses scatter-gather feature of skb to generate large UDP
datagram. Below is a "how-to" on changes required in network device
driver to use the UFO interface.

UDP Fragmentation Offload (UFO) Interface:
-------------------------------------------
UFO is a feature wherein the Linux kernel network stack will offload the
IP fragmentation functionality of large UDP datagram to hardware. This
will reduce the overhead of stack in fragmenting the large UDP datagram to
MTU sized packets

1) Drivers indicate their capability of UFO using
dev->features |= NETIF_F_UFO | NETIF_F_HW_CSUM | NETIF_F_SG

NETIF_F_HW_CSUM is required for UFO over ipv6.

2) UFO packet will be submitted for transmission using driver xmit routine.
UFO packet will have a non-zero value for

"skb_shinfo(skb)->ufo_size"

skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP
fragment going out of the adapter after IP fragmentation by hardware.

skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[]
contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW
indicating that hardware has to do checksum calculation. Hardware should
compute the UDP checksum of complete datagram and also ip header checksum of
each fragmented IP packet.

For IPV6 the UFO provides the fragment identification-id in
skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating
IPv6 fragments.

Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-28 16:30:00 -02:00
Arnaldo Carvalho de Melo de5144164f Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2005-10-28 15:49:24 -02:00
Marcel Holtmann dd7f5527b3 [Bluetooth] Update security filter for Extended Inquiry Response
This patch updates the HCI security filter with support for the Extended
Inquiry Response (EIR) feature.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-10-28 19:20:53 +02:00
Marcel Holtmann 6516455d3b [Bluetooth] Make more functions static
This patch makes another bunch of functions static.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-10-28 19:20:48 +02:00
Marcel Holtmann 408c1ce271 [Bluetooth] Move CRC table into RFCOMM core
This patch moves rfcomm_crc_table[] into the RFCOMM core, because there
is no need to keep it in a separate file.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-10-28 19:20:36 +02:00
Greg KH 6fbfddcb52 Merge ../bleed-2.6 2005-10-28 10:13:16 -07:00
Dmitry Torokhov 34abf91f40 [PATCH] Input: convert net/bluetooth to dynamic input_dev allocation
Input: convert net/bluetooth to dynamic input_dev allocation

This is required for input_dev sysfs integration

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:54 -07:00
Linus Torvalds e5dfa9282f Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-10-28 09:05:25 -07:00
Linus Torvalds 236fa08168 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.15 2005-10-28 08:50:37 -07:00
Al Viro 7d877f3bda [PATCH] gfp_t: net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Trond Myklebust 434f1d10c1 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-27 22:13:32 -04:00
Trond Myklebust 6070fe6f82 RPC: Ensure that nobody can queue up new upcalls after rpc_close_pipes()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:46 -04:00
Jeff Garzik b2ab040db8 Merge branch 'master' 2005-10-27 20:35:17 -04:00
Trond Myklebust 4c2cb58c55 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-27 19:12:49 -04:00
Trond Myklebust 6fa05b1736 Revert "RPC: stops the release_pipe() funtion from being called twice"
This reverts 747c5534c9 commit.
2005-10-27 19:08:18 -04:00
Herbert Xu 2ad41065d9 [TCP]: Clear stale pred_flags when snd_wnd changes
This bug is responsible for causing the infamous "Treason uncloaked"
messages that's been popping up everywhere since the printk was added.
It has usually been blamed on foreign operating systems.  However,
some of those reports implicate Linux as both systems are running
Linux or the TCP connection is going across the loopback interface.

In fact, there really is a bug in the Linux TCP header prediction code
that's been there since at least 2.1.8.  This bug was tracked down with
help from Dale Blount.

The effect of this bug ranges from harmless "Treason uncloaked"
messages to hung/aborted TCP connections.  The details of the bug
and fix is as follows.

When snd_wnd is updated, we only update pred_flags if
tcp_fast_path_check succeeds.  When it fails (for example,
when our rcvbuf is used up), we will leave pred_flags with
an out-of-date snd_wnd value.

When the out-of-date pred_flags happens to match the next incoming
packet we will again hit the fast path and use the current snd_wnd
which will be wrong.

In the case of the treason messages, it just happens that the snd_wnd
cached in pred_flags is zero while tp->snd_wnd is non-zero.  Therefore
when a zero-window packet comes in we incorrectly conclude that the
window is non-zero.

In fact if the peer continues to send us zero-window pure ACKs we
will continue making the same mistake.  It's only when the peer
transmits a zero-window packet with data attached that we get a
chance to snap out of it.  This is what triggers the treason
message at the next retransmit timeout.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-27 15:11:04 -02:00
Andrew Morton 4bcde03d41 [PATCH] svcsock timestamp fix
Convert nanoseconds to microseconds correctly.

Spotted by Steve Dickson <SteveD@redhat.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Jeff Garzik 35848e048f [PATCH] kill massive wireless-related log spam
Although this message is having the intended effect of causing wireless
driver maintainers to upgrade their code, I never should have merged this
patch in its present form.  Leading to tons of bug reports and unhappy
users.

Some wireless apps poll for statistics regularly, which leads to a printk()
every single time they ask for stats.  That's a little bit _too_ much of a
reminder that the driver is using an old API.

Change this to printing out the message once, per kernel boot.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Jeff Garzik 1f57389a38 Merge branch 'master' 2005-10-26 01:06:45 -04:00
James Ketrenos 077783f877 [PATCH] ieee80211 build fix
James Ketrenos wrote:
> [3/4] Use the tx_headroom and reserve requested space.

This patch introduced a compile problem; patch below corrects this.

Fixed compilation error due to not passing tx_headroom in
ieee80211_tx_frame.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-26 00:54:23 -04:00
David Engel dcab5e1eec [IPV4]: Fix setting broadcast for SIOCSIFNETMASK
Fix setting of the broadcast address when the netmask is set via
SIOCSIFNETMASK in Linux 2.6.  The code wanted the old value of
ifa->ifa_mask but used it after it had already been overwritten with
the new value.

Signed-off-by: David Engel <gigem@comcast.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:20:21 -02:00
Ralf Baechle 95df1c04ab [AX.25]: Use constant instead of magic number
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:14:09 -02:00
Randy Dunlap c83c248618 [SK_BUFF] kernel-doc: fix skbuff warnings
Add kernel-doc to skbuff.h, skbuff.c to eliminate kernel-doc warnings.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:10:18 -02:00
Jayachandran C 0d0d2bba97 [IPV4]: Remove dead code from ip_output.c
skb_prev is assigned from skb, which cannot be NULL. This patch removes the
unnecessary NULL check.

Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:58:54 -02:00
Jayachandran C ea7ce40649 [NETLINK]: Remove dead code in af_netlink.c
Remove the variable nlk & call to nlk_sk as it does not have any side effect.

Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:54:46 -02:00
Herbert Xu 80b30c1023 [IPSEC]: Kill obsolete get_mss function
Now that we've switched over to storing MTUs in the xfrm_dst entries,
we no longer need the dst's get_mss methods.  This patch gets rid of
them.

It also documents the fact that our MTU calculation is not optimal
for ESP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:48:45 -02:00
Herbert Xu 1371e37da2 [IPV4]: Kill redundant rcu_dereference on fa_info
This patch kills a redundant rcu_dereference on fa->fa_info in fib_trie.c.
As this dereference directly follows a list_for_each_entry_rcu line, we
have already taken a read barrier with respect to getting an entry from
the list.

This read barrier guarantees that all values read out of fa are valid.
In particular, the contents of structure pointed to by fa->fa_info is
initialised before fa->fa_info is actually set (see fn_trie_insert);
the setting of fa->fa_info itself is further separated with a write
barrier from the insertion of fa into the list.

Therefore by taking a read barrier after obtaining fa from the list
(which is given by list_for_each_entry_rcu), we can be sure that
fa->fa_info contains a valid pointer, as well as the fact that the
data pointed to by fa->fa_info is itself valid.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:25:03 -02:00
Harald Welte eed75f191d [NETFILTER] ip_conntrack: Make "hashsize" conntrack parameter writable
It's fairly simple to resize the hash table, but currently you need to
remove and reinsert the module.  That's bad (we lose connection
state).  Harald has even offered to write a daemon which sets this
based on load.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:19:27 -02:00
Stephen Hemminger d50a6b56f0 [PKTGEN]: proc interface revision
The code to handle the /proc interface can be cleaned up in several places:
* use seq_file for read
* don't need to remember all the filenames separately
* use for_online_cpu's
* don't vmalloc a buffer for small command from user.

Committer note:
This patch clashed with John Hawkes's "[NET]: Wider use of for_each_*cpu()",
so I fixed it up manually.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:12:18 -02:00
Stephen Hemminger b4099fab75 [PKTGEN]: Spelling and white space
Fix some cosmetic issues. Indentation, spelling errors, and some whitespace.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:08:10 -02:00
Stephen Hemminger 2845b63b50 [PKTGEN]: Use kzalloc
These are cleanup patches for pktgen that can go in 2.6.15
Can use kzalloc in a couple of places.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:05:32 -02:00
Stephen Hemminger b7c8921bf1 [PKTGEN]: Sleeping function called under lock
pktgen is calling kmalloc GFP_KERNEL and vmalloc with lock held.
The simplest fix is to turn the lock into a semaphore, since the
thread lock is only used for admin control from user context.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:03:12 -02:00
John Hawkes 670c02c2bf [NET]: Wider use of for_each_*cpu()
In 'net' change the explicit use of for-loops and NR_CPUS into the
general for_each_cpu() or for_each_online_cpu() constructs, as
appropriate.  This widens the scope of potential future optimizations
of the general constructs, as well as takes advantage of the existing
optimizations of first_cpu() and next_cpu(), which is advantageous
when the true CPU count is much smaller than NR_CPUS.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 23:54:01 -02:00
Patrick Caulfield 900e0143a5 [DECNET]: Remove some redundant ifdeffed code
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 23:49:29 -02:00
Jochen Friedrich 5ac660ee13 [TR]: Preserve RIF flag even for 2 byte RIF fields.
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 21:31:38 -02:00
Yan Zheng 4ea6a8046b [IPV6]: Fix refcnt of struct ip6_flowlabel
Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 21:17:52 -02:00
Herbert Xu 49636bb128 [NEIGH] Fix timer leak in neigh_changeaddr
neigh_changeaddr attempts to delete neighbour timers without setting
nud_state.  This doesn't work because the timer may have already fired
when we acquire the write lock in neigh_changeaddr.  The result is that
the timer may keep firing for quite a while until the entry reaches
NEIGH_FAILED.

It should be setting the nud_state straight away so that if the timer
has already fired it can simply exit once we relinquish the lock.

In fact, this whole function is simply duplicating the logic in
neigh_ifdown which in turn is already doing the right thing when
it comes to deleting timers and setting nud_state.

So all we have to do is take that code out and put it into a common
function and make both neigh_changeaddr and neigh_ifdown call it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 17:18:00 +10:00
Herbert Xu 6fb9974f49 [NEIGH] Fix add_timer race in neigh_add_timer
neigh_add_timer cannot use add_timer unconditionally.  The reason is that
by the time it has obtained the write lock someone else (e.g., neigh_update)
could have already added a new timer.

So it should only use mod_timer and deal with its return value accordingly.

This bug would have led to rare neighbour cache entry leaks.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:37:48 +10:00
Herbert Xu 203755029e [NEIGH] Print stack trace in neigh_add_timer
Stack traces are very helpful in determining the exact nature of a bug.
So let's print a stack trace when the timer is added twice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:11:39 +10:00
Julian Anastasov c98d80edc8 [SK_BUFF]: ipvs_property field must be copied
IPVS used flag NFC_IPVS_PROPERTY in nfcache but as now nfcache was removed the
new flag 'ipvs_property' still needs to be copied. This patch should be
included in 2.6.14.

Further comments from Harald Welte:

Sorry, seems like the bug was introduced by me.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-22 17:06:01 -02:00
Michael Buesch d3f7bf4fa9 ieee80211 subsystem:
* Use GFP mask on TX skb allocation.
* Use the tx_headroom and reserve requested space.

Signed-off-by: Michael Buesch <mbuesch@freenet.de>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-21 13:00:28 -05:00
Herbert Xu b2cc99f04c [TCP] Allow len == skb->len in tcp_fragment
It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 17:13:13 -02:00
Herbert Xu 49c5bfaffe [DCCP]: Clear the IPCB area
Turns out the problem has nothing to do with use-after-free or double-free.
It's just that we're not clearing the CB area and DCCP unlike TCP uses a CB
format that's incompatible with IP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:49:59 -02:00
Herbert Xu ffa29347df [DCCP]: Make dccp_write_xmit always free the packet
icmp_send doesn't use skb->sk at all so even if skb->sk has already
been freed it can't cause crash there (it would've crashed somewhere
else first, e.g., ip_queue_xmit).

I found a double-free on an skb that could explain this though.
dccp_sendmsg and dccp_write_xmit are a little confused as to what
should free the packet when something goes wrong.  Sometimes they
both go for the ball and end up in each other's way.

This patch makes dccp_write_xmit always free the packet no matter
what.  This makes sense since dccp_transmit_skb which in turn comes
from the fact that ip_queue_xmit always frees the packet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:44:29 -02:00
Herbert Xu fda0fd6c5b [DCCP]: Use skb_set_owner_w in dccp_transmit_skb when skb->sk is NULL
David S. Miller <davem@davemloft.net> wrote:
> One thing you can probably do for this bug is to mark data packets
> explicitly somehow, perhaps in the SKB control block DCCP already
> uses for other data.  Put some boolean in there, set it true for
> data packets.  Then change the test in dccp_transmit_skb() as
> appropriate to test the boolean flag instead of "skb_cloned(skb)".

I agree.  In fact we already have that flag, it's called skb->sk.
So here is patch to test that instead of skb_cloned().

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:25:28 -02:00
Hong Liu f0f15ab554 Fixed oops if an uninitialized key is used for encryption.
Without this patch, if you try and use a key that has not been
configured, for example:

% iwconfig eth1 key deadbeef00 [2]

without having configured key [1], then the active key will still be
[1], but privacy will now be enabled.  Transmission of a packet in this
situation will result in a kernel oops.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-20 11:06:36 -05:00
Hong Liu 5b74eda78d Fixed problem with not being able to decrypt/encrypt broadcast packets.
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-19 16:49:03 -05:00
J. Bruce Fields a0857d03b2 RPCSEC_GSS: krb5 cleanup
Remove some senseless wrappers.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:47 -07:00
J. Bruce Fields 00fd6e1425 RPCSEC_GSS remove all qop parameters
Not only are the qop parameters that are passed around throughout the gssapi
 unused by any currently implemented mechanism, but there appears to be some
 doubt as to whether they will ever be used.  Let's just kill them off for now.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:47 -07:00
J. Bruce Fields 14ae162c24 RPCSEC_GSS: Add support for privacy to krb5 rpcsec_gss mechanism.
Add support for privacy to the krb5 rpcsec_gss mechanism.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:46 -07:00
J. Bruce Fields bfa91516b5 RPCSEC_GSS: krb5 pre-privacy cleanup
The code this was originally derived from processed wrap and mic tokens using
 the same functions.  This required some contortions, and more would be required
 with the addition of xdr_buf's, so it's better to separate out the two code
 paths.

 In preparation for adding privacy support, remove the last vestiges of the
 old wrap token code.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:45 -07:00
J. Bruce Fields f7b3af64c6 RPCSEC_GSS: Simplify rpcsec_gss crypto code
Factor out some code that will be shared by privacy crypto routines

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:45 -07:00
J. Bruce Fields 2d2da60c63 RPCSEC_GSS: client-side privacy support
Add the code to the client side to handle privacy.  This is dead code until
 we actually add privacy support to krb5.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:44 -07:00
J. Bruce Fields 24b2605bec RPCSEC_GSS: cleanup au_rslack calculation
Various xdr encode routines use au_rslack to guess where the reply argument
 will end up, so we can set up the xdr_buf to recieve data into the right place
 for zero copy.

 Currently we calculate the au_rslack estimate when we check the verifier.
 Normally this only depends on the verifier size.  In the integrity case we add
 a few bytes to allow for a length and sequence number.

 It's a bit simpler to calculate only the verifier size when we check the
 verifier, and delay the full calculation till we unwrap.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:44 -07:00
J. Bruce Fields f3680312a7 SUNRPC: Retry wrap in case of memory allocation failure.
For privacy we need to allocate extra pages to hold encrypted page data when
 wrapping requests.  This allocation may fail, and we handle that case by
 waiting and retrying.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:43 -07:00
J. Bruce Fields ead5e1c26f SUNRPC: Provide a callback to allow free pages allocated during xdr encoding
For privacy, we need to allocate pages to store the encrypted data (passed
 in pages can't be used without the risk of corrupting data in the page cache).
 So we need a way to free that memory after the request has been transmitted.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:43 -07:00
J. Bruce Fields 293f1eb551 SUNRPC: Add support for privacy to generic gss-api code.
Add support for privacy to generic gss-api code.  This is dead code until we
 have both a mechanism that supports privacy and code in the client or server
 that uses it.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:42 -07:00
Steve Dickson 747c5534c9 RPC: stops the release_pipe() funtion from being called twice
This patch stops the release_pipe() funtion from being called
 twice by invalidating the ops pointer in the rpc_inode
 when rpc_pipe_release() is called.

 Signed-off-by: Steve Dickson <steved@redhat.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:40 -07:00
Jiri Benc 757d18faee [PATCH] ieee80211: division by zero fix
This fixes division by zero bug in ieee80211_wx_get_scan().

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-18 17:25:36 -04:00
Trond Myklebust 5e5ce5be6f RPC: allow call_encode() to delay transmission of an RPC call.
Currently, call_encode will cause the entire RPC call to abort if it returns
 an error. This is unnecessarily rigid, and gets in the way of attempts
 to allow the NFSv4 layer to order RPC calls that carry sequence ids.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:11 -07:00
Chuck Lever ea635a517e SUNRPC: Retry rpcbind requests if the server's portmapper isn't up
After a server crash/reboot, rebinding should always retry, otherwise
 requests on "hard" mounts will fail when they shouldn't.

 Test plan:
 Run a lock-intensive workload against a server while rebooting the server
 repeatedly.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:10 -07:00
Jeff Garzik 28af493cd7 Merge branch 'master' 2005-10-18 17:14:17 -04:00
Trond Myklebust cff6bf9709 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-18 13:50:52 -07:00
Andrew Morton e6850cce8f [NETFILTER]: Fix ip6_table.c build with NETFILTER_DEBUG enabled.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-15 16:15:38 -07:00
Jeff Garzik 59aee3c2a1 Merge branch 'master' 2005-10-13 21:22:27 -04:00
Herbert Xu 046d20b739 [TCP]: Ratelimit debugging warning.
Better safe than sorry.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:42:24 -07:00
Andi Kleen 34cb711ba9 [NET]: Disable NET_SCH_CLK_CPU for SMP x86 hosts
Opterons with frequency scaling have fully unsynchronized TSCs
running at different frequencies, so using TSCs there is not a good idea. 
Also some other x86 boxes have this problem. gettimeofday should be good 
enough, so just disable it.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:41:44 -07:00
David S. Miller c8923c6b85 [NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering.
Original patch by Harald Welte, with feedback from Herbert Xu
and testing by Sbastien Bernard.

EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus
are numbered linearly.  That is not necessarily true.

This patch fixes that up by calculating the largest possible
cpu number, and allocating enough per-cpu structure space given
that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:41:23 -07:00
Herbert Xu 9ff5c59ce2 [TCP]: Add code to help track down "BUG at net/ipv4/tcp_output.c:438!"
This is the second report of this bug.  Unfortunately the first
reporter hasn't been able to reproduce it since to provide more
debugging info.

So let's apply this patch for 2.6.14 to

1) Make this non-fatal.
2) Provide the info we need to track it down.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 15:59:39 -07:00
Stephen Hemminger ab4060e858 [BRIDGE]: fix race on bridge del if
This fixes the RCU race on bridge delete interface.  Basically,
the network device has to be detached from the bridge in the first
step (pre-RCU), rather than later. At that point, no more bridge traffic
will come in, and the other code will not think that network device
is part of a bridge.

This should also fix the XEN test problems.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 15:10:01 -07:00
Arnaldo Carvalho de Melo eeb2b85606 [TWSK]: Grab the module refcount for timewait sockets
This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:25:23 -07:00
Arnaldo Carvalho de Melo 2a9bc9bb4d [DCCP]: Transition from PARTOPEN to OPEN when receiving DATA packets
Noticed by Andrea Bittau, that provided a patch that was modified to
not transition from RESPOND to OPEN when receiving DATA packets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:25:00 -07:00
Arnaldo Carvalho de Melo 777b25a2fe [CCID]: Check if ccid is NULL in the hc_[tr]x_exit functions
For consistency with ccid_exit and to fix a bug when
IP_DCCP_UNLOAD_HACK is enabled as the control sock is not associated
to any CCID.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:24:20 -07:00
Pablo Neira Ayuso 061cb4a0ec [NETFILTER] ctnetlink: add support to change protocol info
This patch add support to change the state of the private protocol
information via conntrack_netlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:23:46 -07:00
Pablo Neira Ayuso 3392315375 [NETFILTER] ctnetlink: allow userspace to change TCP state
This patch adds the ability of changing the state a TCP connection. I know
that this must be used with care but it's required to provide a complete
conntrack creation via conntrack_netlink. So I'll document this aspect on
the upcoming docs.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:23:28 -07:00
Harald Welte a051a8f730 [NETFILTER]: Use only 32bit counters for CONNTRACK_ACCT
Initially we used 64bit counters for conntrack-based accounting, since we
had no event mechanism to tell userspace that our counters are about to
overflow.  With nfnetlink_conntrack, we now have such a event mechanism and
thus can save 16bytes per connection.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:21:10 -07:00
Herbert Xu d4875b049b [IPSEC] Fix block size/MTU bugs in ESP
This patch fixes the following bugs in ESP:

* Fix transport mode MTU overestimate.  This means that the inner MTU
  is smaller than it needs be.  Worse yet, given an input MTU which
  is a multiple of 4 it will always produce an estimate which is not
  a multiple of 4.

  For example, given a standard ESP/3DES/MD5 transform and an MTU of
  1500, the resulting MTU for transport mode is 1462 when it should
  be 1464.

  The reason for this is because IP header lengths are always a multiple
  of 4 for IPv4 and 8 for IPv6.

* Ensure that the block size is at least 4.  This is required by RFC2406
  and corresponds to what the esp_output function does.  At the moment
  this only affects crypto_null as its block size is 1.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:11:34 -07:00
Herbert Xu a02a64223e [IPSEC]: Use ALIGN macro in ESP
This patch uses the macro ALIGN in all the applicable spots for ESP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:11:08 -07:00
Pablo Neira Ayuso e1c73b78e3 [NETFILTER] ctnetlink: add one nesting level for TCP state
To keep consistency, the TCP private protocol information is nested
attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes to
access the TCP state information looks like here below:

CTA_PROTOINFO
CTA_PROTOINFO_TCP
CTA_PROTOINFO_TCP_STATE

instead of:

CTA_PROTOINFO
CTA_PROTOINFO_TCP_STATE

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:55:49 -07:00
Pablo Neira Ayuso a1bcc3f268 [NETFILTER] ctnetlink: ICMP ID is not mandatory
The ID is only required by ICMP type 8 (echo), so it's not
mandatory for all sort of ICMP connections. This patch makes
mandatory only the type and the code for ICMP netlink messages.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:53:16 -07:00
Harald Welte d000eaf772 [NETFILTER] conntrack_netlink: Fix endian issue with status from userspace
When we send "status" from userspace, we forget to convert the endianness.
This patch adds the reqired conversion.  Thanks to Pablo Neira for
discovering this.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:52:51 -07:00
Harald Welte ebe0bbf06c [NETFILTER] nfnetlink: use highest bit of nfa_type to indicate nested TLV
As Henrik Nordstrom pointed out, all our efforts with "split endian" (i.e.
host byte order tags, net byte order values) are useless, unless a parser
can determine whether an attribute is nested or not.

This patch steals the highest bit of nfattr.nfa_type to indicate whether
the data payload contains a nested nfattr (1) or not (0).

This will break userspace compatibility, but luckily no kernel with
nfnetlink was released so far.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:52:19 -07:00
Harald Welte f40863cec8 [NETFILTER] ipt_ULOG: Mark ipt_ULOG as OBSOLETE
Similar to nfnetlink_queue and ip_queue, we mark ipt_ULOG as obsolete.
This should have been part of the original nfnetlink_log merge, but
I somehow missed it.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:51:53 -07:00
Harald Welte 85d9b05d9b [NETFILTER] PPTP helper: Add missing Kconfig dependency
PPTP should not be selectable without conntrack enabled

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:47:42 -07:00
Al Viro dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Jean-Denis Boyer 4f55cd105c [ATM]: [br2684] if we free the skb, we should return 0
From: "Jean-Denis Boyer" <jdboyer@mediatrix.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-07 13:44:35 -07:00
Eric Kinzie 0f21ba7cc3 [ATM]: add support for LECS addresses learned from network
From: Eric Kinzie <ekinzie@cmf.nrl.navy.mil>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-06 22:19:28 -07:00
Ivan Skytte Jrgensen 5fe467ee97 [SCTP] Fix sctp_get{pl}addrs() API to work with 32-bit apps on 64-bit kernels.
The old socket options are marked with a _OLD suffix so that the
existing 32-bit apps on 32-bit kernels do not break.

Signed-off-by: Ivan Skytte Jrgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-06 21:36:17 -07:00
Ralf Baechle 3a867b36c3 [AX.25]: Fix packet socket crash
Since changeset 98a82febb6 AX.25 is passing
received IP and ARP packets to the stack through netif_rx() but we don't
set the skb->mac.raw to right value which may result in a crash with
applications that use a packet socket.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:16:04 -07:00
Herbert Xu 77d8d7a684 [IPSEC]: Document that policy direction is derived from the index.
Here is a patch that adds a helper called xfrm_policy_id2dir to
document the fact that the policy direction can be and is derived
from the index.

This is based on a patch by YOSHIFUJI Hideaki and 210313105@suda.edu.cn.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:15:12 -07:00
YOSHIFUJI Hideaki 140e26fcd5 [IPV6]: Fix NS handing for proxy/anycast address
Timer set up by pneigh_enqueue() ended up calling ndisc_rcv()
via pndisc_redo(), which clears LOCALLY_ENQUEUED flag in
NEIGH_CB(skb) and NS was queued again.
Let's call ndisc_recv_ns() directly to avoid the loop.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:11:41 -07:00
Stephen Hemminger 42a39450f8 [TCP]: BIC coding bug in Linux 2.6.13
Missing parenthesis in causes BIC to be slow in increasing congestion
window.

Spotted by Injong Rhee.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:09:31 -07:00
Yan Zheng fab10fe37a [MCAST] ipv6: Fix address size in grec_size
Signed-Off-By: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:08:13 -07:00
Jeff Garzik 0d69ae5fb7 Merge branch 'master' 2005-10-05 02:11:33 -04:00
Randy Dunlap 83fa3400eb [XFRM]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in xfrm code:
net/xfrm/xfrm_policy.c:232:47: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:45:35 -07:00
Randy Dunlap dd13a285b7 [RPC]: fix sparse gfp nocast warnings
Fix nocast sparse warnings:
net/rxrpc/call.c:2013:25: warning: implicit cast to nocast type
net/rxrpc/connection.c:538:46: warning: implicit cast to nocast type
net/sunrpc/sched.c:730:36: warning: implicit cast to nocast type
net/sunrpc/sched.c:734:56: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:44:45 -07:00
Randy Dunlap 00fa023345 [AF_KEY]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in net/key code:
net/key/af_key.c:195:27: warning: implicit cast to nocast type
net/key/af_key.c:1439:28: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:43:04 -07:00
Randy Dunlap c6f4fafccf [NETFILTER]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in nfnetlink code:
net/netfilter/nfnetlink.c:204:43: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:42:42 -07:00
Randy Dunlap 8eea00a44d [IPVS]: fix sparse gfp nocast warnings
From: Randy Dunlap <rdunlap@xenotime.net>

Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:42:15 -07:00
Randy Dunlap f4a19a56e3 [DECNET]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in decnet code:
net/decnet/af_decnet.c:458:40: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:125:35: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:219:29: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:41:48 -07:00
Randy Dunlap 7b5b3f3d82 [ATM]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in atm code:
net/atm/atm_misc.c:35:44: warning: implicit cast to nocast type
drivers/atm/fore200e.c:183:33: warning: implicit cast to nocast type

Also use kzalloc() instead of kmalloc().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:38:44 -07:00
Horst H. von Brand a5181ab06d [NETFILTER]: Fix Kconfig typo
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 15:58:56 -07:00
Robert Olsson e6308be85a [IPV4]: fib_trie root-node expansion
The patch below introduces special thresholds to keep root node in the trie 
large. This gives a flatter tree at the cost of a modest memory increase.
Overall it seems to be gain and this was also proposed by one the authors 
of the paper in recent a seminar.

Main table after loading 123 k routes.

	Aver depth:     3.30
	Max depth:      9
        Root-node size  12 bits
        Total size: 4044  kB

With the patch:
	Aver depth:     2.78
	Max depth:      8
        Root-node size  15 bits
        Total size: 4150  kB

An increase of 8-10% was seen in forwading performance for an rDoS attack. 

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 13:01:58 -07:00
YOSHIFUJI Hideaki 87bf9c97b4 [IPV6]: Fix infinite loop in udp_v6_get_port().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 13:00:39 -07:00
Jeff Garzik 13d1ef29bc Merge rsync://bughost.org/repos/ieee80211-delta/ 2005-10-04 08:22:13 -04:00
Jeff Garzik d9e34325fd Merge branch 'upstream-fixes' 2005-10-04 05:30:02 -04:00
Randy Dunlap f36a29d567 [PATCH] ieee80211: fix gfp flags type
Fix implicit nocast warnings in ieee80211 code, including __nocast:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-04 05:29:48 -04:00
Jeff Garzik 3c8c7b2f32 Merge branch 'upstream-fixes' 2005-10-03 22:06:19 -04:00
Randy Dunlap 8cb6108bae [PATCH] ieee80211: fix gfp flags type
Fix implicit nocast warnings in ieee80211 code:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-03 22:01:14 -04:00
David S. Miller 7ce312467e [IPV4]: Update icmp sysctl docs and disable broadcast ECHO/TIMESTAMP by default
It's not a good idea to be smurf'able by default.
The few people who need this can turn it on.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 16:07:30 -07:00
Herbert Xu 3e56a40bb3 [IPV4]: Get rid of bogus __in_put_dev in pktgen
This patch gets rid of a bogus __in_dev_put() in pktgen.c.  This was
spotted by Suzanne Wood.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:36:32 -07:00
Herbert Xu e5ed639913 [IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl
The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.

1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().

There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition.  I've marked it as such so that we remember to fix it.

This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:35:55 -07:00
David S. Miller a5e7c210fe [IPV6]: Fix leak added by udp connect dst caching fix.
Based upon a patch from Mitsuru KANDA <mk@linux-ipv6.org>

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:21:58 -07:00
Yan Zheng f36d6ab182 [IPV6]: Fix ipv6 fragment ID selection at slow path
Signed-Off-By: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:19:15 -07:00
Herbert Xu 444fc8fc3a [IPV4]: Fix "Proxy ARP seems broken"
Meelis Roos <mroos@linux.ee> wrote:
> RK> My firewall setup relies on proxyarp working.  However, with 2.6.14-rc3,
> RK> it appears to be completely broken.  The firewall is 212.18.232.186,
> 
> Same here with some kernel between 14-rc2 and 14-rc3 - no reposnse to
> ARP on a proxyarp gateway. Sorry, no exact revison and no more debugging
> yet since it'a a production gateway.

The breakage is caused by the change to use the CB area for flagging
whether a packet has been queued due to proxy_delay.  This area gets
cleared every time arp_rcv gets called.  Unfortunately packets delayed
due to proxy_delay also go through arp_rcv when they are reprocessed.

In fact, I can't think of a reason why delayed proxy packets should go
through netfilter again at all.  So the easiest solution is to bypass
that and go straight to arp_process.

This is essentially what would've happened before netfilter support
was added to ARP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> 
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:18:10 -07:00
Russell King 496a22b08f [NET]: Fix "sysctl_net.c:36: error: 'core_table' undeclared here"
During the build for ARM machine type "fortunet", this error occurred:

  CC      net/sysctl_net.o
net/sysctl_net.c:36: error: 'core_table' undeclared here (not in a function)

It appears that the following configuration settings cause this error
due to a missing include:
CONFIG_SYSCTL=y
CONFIG_NET=y
# CONFIG_INET is not set

core_table appears to be declared in net/sock.h.  if CONFIG_INET were
defined, net/sock.h would have been included via:
  sysctl_net.c -> net/ip.h -> linux/ip.h -> net/sock.h

so include it directly.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:16:34 -07:00
Eric Dumazet 81c3d5470e [INET]: speedup inet (tcp/dccp) lookups
Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &head->chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.

As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))
     ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
     ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)->sk_hash == (__hash))                 &&  \
     (inet_sk(__sk)->daddr          == (__saddr))   &&  \
     (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
     (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))


- Adds a prefetch(head->chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&head->lock);

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:13:38 -07:00
Herbert Xu 325ed82393 [NET]: Fix packet timestamping.
I've found the problem in general.  It affects any 64-bit
architecture.  The problem occurs when you change the system time.

Suppose that when you boot your system clock is forward by a day.
This gets recorded down in skb_tv_base.  You then wind the clock back
by a day.  From that point onwards the offset will be negative which
essentially overflows the 32-bit variables they're stored in.

In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.

When we do overflow, we'll need a better solution of course.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 13:57:23 -07:00
James Ketrenos ff0037b259 Lindent and trailing whitespace script executed ieee80211 subsystem
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-03 10:23:42 -05:00
Ivo van Doorn c1bda44a4a When an assoc_resp is received the network structure is not completely
initialized which can cause problems for drivers that expect the network
structure to be completely filled in.

This patch will make sure the network is filled in as much as possible.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-03 10:20:47 -05:00
Ivo van Doorn ff9e00f1b0 Currently the info_element is parsed by 2 seperate functions, this
results in a lot of duplicate code.

This will move the parsing stage into a seperate function.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-03 10:19:25 -05:00
Randy Dunlap e846cbb112 Fix implicit nocast warnings in ieee80211 code:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-03 10:02:14 -05:00
Ivo van Doorn 7c254d3dba This will move the ieee80211_is_ofdm_rate function to the ieee80211.h
header, and I also added the ieee80211_is_cck_rate counterpart.

Various drivers currently create there own version of these functions,
but I guess the ieee80211 stack is the best place to provide such
routines.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-03 09:50:40 -05:00
Scott Talbert 75b895c15b [ATM]: [lec] reset retry counter when new arp issued
From: Scott Talbert <scott.talbert@lmco.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 17:31:30 -07:00
Scott Talbert 4a7097fcc4 [ATM]: [lec] attempt to support cisco failover
From: Scott Talbert <scott.talbert@lmco.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 17:30:54 -07:00
Alexey Kuznetsov 09e9ec8711 [TCP]: Don't over-clamp window in tcp_clamp_window()
From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

Handle better the case where the sender sends full sized
frames initially, then moves to a mode where it trickles
out small amounts of data at a time.

This known problem is even mentioned in the comments
above tcp_grow_window() in tcp_input.c, specifically:

...
 * The scheme does not work when sender sends good segments opening
 * window and then starts to feed us spagetti. But it should work
 * in common situations. Otherwise, we have to rely on queue collapsing.
...

When the sender gives full sized frames, the "struct sk_buff" overhead
from each packet is small.  So we'll advertize a larger window.
If the sender moves to a mode where small segments are sent, this
ratio becomes tilted to the other extreme and we start overrunning
the socket buffer space.

tcp_clamp_window() tries to address this, but it's clamping of
tp->window_clamp is a wee bit too aggressive for this particular case.

Fix confirmed by Ion Badulescu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 17:17:15 -07:00
David S. Miller 01ff367e62 [TCP]: Revert 6b251858d3
But retain the comment fix.

Alexey Kuznetsov has explained the situation as follows:

--------------------

I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.

See?

So, the honest function for inital rcv_wnd derived from
tcp_init_cwnd() is:

	init_rcv_wnd(mss)=
	  min { init_cwnd(mss1)*mss1 for mss1 <= mss }

It is something sort of:

	if (mss < 1096)
		return mss*4;
	if (mss < 1096*2)
		return 1096*4;
	return mss*2;

(I just scrablled a graph of piece of paper, it is difficult to see or
to explain without this)

I selected it differently giving more window than it is strictly
required.  Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst.  But besides that it is arbitrary, so I decided to give
slack space of one segment.

Actually, the logic was:

If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.

For msses slighlty more than ethernet frame, take 3. Try to give slack
space of 1 frame again.

If mss is huge, force 2*mss. No slack space.

Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.

--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 17:07:20 -07:00
Linus Torvalds eb693d2994 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-29 08:56:47 -07:00
Al Viro 666002218d [PATCH] proc_mkdir() should be used to create procfs directories
A bunch of create_proc_dir_entry() calls creating directories had crept
in since the last sweep; converted to proc_mkdir().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 08:46:26 -07:00
David S. Miller 01d40f28b1 [NET]: Fix reversed logic in eth_type_trans().
I got the second compare_eth_addr() test reversed, oops.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 22:37:53 -07:00
Martin Whitaker 735631a919 [ATM]: fix bug in atm address list handling
From: Martin Whitaker <atm@martin-whitaker.co.uk>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-28 16:35:22 -07:00
Chas Williams 9301e320e9 [ATM]: track and close listen sockets when sigd exits
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-28 16:35:01 -07:00
Roman Kagan e2c4b72158 [ATM]: net/atm/ioctl.c: autoload pppoatm and br2684
Signed-off-by: Roman Kagan <rkagan@mail.ru>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-28 16:34:24 -07:00
David S. Miller 6b251858d3 [TCP]: Fix init_cwnd calculations in tcp_select_initial_window()
Match it up to what RFC2414 really specifies.
Noticed by Rick Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 16:31:48 -07:00
Oliver Dawid 64233bffbb [APPLETALK]: Fix broadcast bug.
From: Oliver Dawid <oliver@helios.de>

we found a bug in net/appletalk/ddp.c concerning broadcast packets. In 
kernel 2.4 it was working fine. The bug first occured 4 years ago when 
switching to new SNAP layer handling. This bug can be splitted up into a 
sending(1) and reception(2) problem:

Sending(1)
In kernel 2.4 broadcast packets were sent to a matching ethernet device 
and atalk_rcv() was called to receive it as "loopback" (so loopback 
packets were shortcutted and handled in DDP layer).

When switching to the new SNAP structure, this shortcut was removed and 
the loopback packet was send to SNAP layer. The author forgot to replace 
the remote device pointer by the loopback device pointer before sending 
the packet to SNAP layer (by calling ddp_dl->request() ) therfor the 
packet was not sent back by underlying layers to ddp's atalk_rcv().

Reception(2)
In atalk_rcv() a packet received by this loopback mechanism contains now 
the (rigth) loopback device pointer (in Kernel 2.4 it was the (wrong) 
remote ethernet device pointer) and therefor no matching socket will be 
found to deliver this packet to. Because a broadcast packet should be 
send to the first matching socket (as it is done in many other protocols 
(?)), we removed the network comparison in broadcast case.

Below you will find a patch to correct this bug. Its diffed to kernel 
2.6.14-rc1

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 16:11:29 -07:00
David S. Miller ba645c1602 [NET]: Slightly optimize ethernet address comparison.
We know the thing is at least 2-byte aligned, so take
advantage of that instead of invoking memcmp() which
results in truly horrifically inefficient code because
it can't assume anything about alignment.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 16:03:05 -07:00
Alexey Dobriyan 520d1b830a [ROSE]: fix typo (regeistration)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:45:15 -07:00
Alexey Dobriyan a83cd2cc90 [ROSE]: check rose_ndevs earlier
* Don't bother with proto registering if rose_ndevs is bad.
* Make escape structure more coherent.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:44:36 -07:00
Alexey Dobriyan 70ff3b66d7 [ROSE]: return sane -E* from rose_proto_init()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:43:46 -07:00
Alexey Dobriyan c3c4ed652e [ROSE]: do proto_unregister() on exit paths
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:42:58 -07:00
Frank Filz a79af59efd [NET]: Fix module reference counts for loadable protocol modules
I have been experimenting with loadable protocol modules, and ran into
several issues with module reference counting.

The first issue was that __module_get failed at the BUG_ON check at
the top of the routine (checking that my module reference count was
not zero) when I created the first socket. When sk_alloc() is called,
my module reference count was still 0. When I looked at why sctp
didn't have this problem, I discovered that sctp creates a control
socket during module init (when the module ref count is not 0), which
keeps the reference count non-zero. This section has been updated to
address the point Stephen raised about checking the return value of
try_module_get().

The next problem arose when my socket init routine returned an error.
This resulted in my module reference count being decremented below 0.
My socket ops->release routine was also being called. The issue here
is that sock_release() calls the ops->release routine and decrements
the ref count if sock->ops is not NULL. Since the socket probably
didn't get correctly initialized, this should not be done, so we will
set sock->ops to NULL because we will not call try_module_get().

While searching for another bug, I also noticed that sys_accept() has
a possibility of doing a module_put() when it did not do an
__module_get so I re-ordered the call to security_socket_accept().

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:23:38 -07:00
Eric Dumazet 2d7ceece08 [NET]: Prefetch dev->qdisc_lock in dev_queue_xmit()
We know the lock is going to be taken.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:22:58 -07:00
Daniel Phillips bc8dfcb939 [NET]: Use non-recursive algorithm in skb_copy_datagram_iovec()
Use iteration instead of recursion.  Fraglists within fraglists
should never occur, so we BUG check this.

Signed-off-by: Daniel Phillips <phillips@istop.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:22:35 -07:00
David S. Miller 667347f1ca [NEIGH]: Add debugging check when adding timers.
If we double-add a neighbour entry timer, which should be
impossible but has been reported, dump the current state of
the entry so that we can debug this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 12:07:44 -07:00
David S. Miller 56e9b26324 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/llc-2.6 2005-09-26 15:29:31 -07:00
Harald Welte 188bab3ae0 [NETFILTER]: Fix invalid module autoloading by splitting iptable_nat
When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.

This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko).  Threfore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend.  ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.

This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26 15:25:11 -07:00
David S. Miller b85daee0e4 [AF_PACKET]: Remove bogus checks added to packet_sendmsg().
These broke existing apps, and the checks are superfluous
as the values being verified aren't even used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26 15:23:58 -07:00
Herbert Xu c62dba9011 [IPV6]: Fix [Bug 5306] Oops on IPv6 route lookup
> Steps to reproduce:
> 1. Boot Linux, do NOT setup any IPv6 routes
> 2. ip route get 2001::1 (or any unroutable address)

Well caught.  We never set rt6i_idev on ip6_null_entry.
This patch should make the problem go away.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26 15:10:16 -07:00
Alex Williamson b9d717a7b4 [NET]: Make sure ctl buffer is aligned properly in sys_sendmsg().
It's on the stack and declared as "unsigned char[]", but pointers
and similar can be in here thus we need to give it an explicit
alignment attribute.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26 14:28:02 -07:00
Harald Welte 8ddec7460d [NETFILTER] ip_conntrack: Update event cache when status changes
The GRE, SCTP and TCP protocol helpers did not call
ip_conntrack_event_cache() when updating ct->status.  This patch adds
the respective calls.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:56:08 -07:00
Alexey Dobriyan 8689c07e47 [IRDA]: *irttp cleanup
* Remove useless comment.
* Remove useless assertions.
* Remove useless comparison.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:55:17 -07:00
Alexey Dobriyan 15166fadb0 [IRDA]: Fix memory leak in irttp_init()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:54:50 -07:00
Amos Waterland 45fc3b11f1 [NET]: Protect neigh_stat_seq_fops by CONFIG_PROC_FS
From: Amos Waterland <apw@us.ibm.com>

If CONFIG_PROC_FS is not selected, the compiler emits this warning:

 net/core/neighbour.c:64: warning: `neigh_stat_seq_fops' defined but not used

Which is correct, because neigh_stat_seq_fops is in fact only
initialized and used by code that is protected by CONFIG_PROC_FS.  So
this patch fixes that up.

Signed-off-by: Amos Waterland <apw@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:53:16 -07:00
Harald Welte d67b24c40f [NETFILTER]: Fix ip[6]t_NFQUEUE Kconfig dependency
We have to introduce a separate Kconfig menu entry for the NFQUEUE targets.
They cannot "just" depend on nfnetlink_queue, since nfnetlink_queue could
be linked into the kernel, whereas iptables can be a module.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:52:03 -07:00
Chuck Lever 6cd7525a00 SUNRPC: fix bug in patch "portmapper doesn't need a reserved port"
The in-kernel portmapper does in fact need a reserved port when registering
 new services, but not when performing bind queries.

 Ensure that we distinguish between the two cases.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 13:54:10 -04:00
Trond Myklebust f134585a73 Revert "[PATCH] RPC,NFS: new rpc_pipefs patch"
This reverts 17f4e6febca160a9f9dd4bdece9784577a2f4524 commit.
2005-09-23 12:39:00 -04:00
Christoph Hellwig 278c995c8a [PATCH] RPC,NFS: new rpc_pipefs patch
Currently rpc_mkdir/rpc_rmdir and rpc_mkpipe/mk_unlink have an API that's
 a little unfortunate.  They take a path relative to the rpc_pipefs root and
 thus need to perform a full lookup.  If you look at debugfs or usbfs they
 always store the dentry for directories they created and thus can pass in
 a dentry + single pathname component pair into their equivalents of the
 above functions.

 And in fact rpc_pipefs actually stores a dentry for all but one component so
 this change not only simplifies the core rpc_pipe code but also the callers.

 Unfortuntately this code path is only used by the NFS4 idmapper and
 AUTH_GSSAPI for which I don't have a test enviroment.  Could someone give
 it a spin?  It's the last bit needed before we can rework the
 lookup_hash API

 Signed-off-by: Christoph Hellwig <hch@lst.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:57 -04:00
Chuck Lever 470056c288 [PATCH] RPC: rationalize set_buffer_size
In fact, ->set_buffer_size should be completely functionless for non-UDP.

 Test-plan:
 Check socket buffer size on UDP sockets over time.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:55 -04:00
Chuck Lever 03bf4b707e [PATCH] RPC: parametrize various transport connect timeouts
Each transport implementation can now set unique bind, connect,
 reestablishment, and idle timeout values.  These are variables,
 allowing the values to be modified dynamically.  This permits
 exponential backoff of any of these values, for instance.

 As an example, we implement exponential backoff for the connection
 reestablishment timeout.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:53 -04:00
Chuck Lever 3167e12c0c [PATCH] RPC: make sure to get the same local port number when reconnecting
Implement a best practice: if the remote end drops our connection, try to
 reconnect using the same port number.  This is important because the NFS
 server's Duplicate Reply Cache often hashes on the source port number.
 If the client reuses the port number when it reconnects, the server's DRC
 will be more effective.

 Based on suggestions by Mike Eisler, Olaf Kirch, and Alexey Kuznetsky.

 Test-plan:
 Destructive testing.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:52 -04:00
Chuck Lever 529b33c6db [PATCH] RPC: allow RPC client's port range to be adjustable
Select an RPC client source port between 650 and 1023 instead of between
 1 and 800.  The old range conflicts with a number of network services.
 Provide sysctls to allow admins to select a different port range.

 Note that this doesn't affect user-level RPC library behavior, which
 still uses 1 to 800.

 Based on a suggestion by Olaf Kirch <okir@suse.de>.

 Test-plan:
 Repeated mount and unmount.  Destructive testing.  Idle timeouts.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:50 -04:00
Chuck Lever 555ee3af16 [PATCH] RPC: clean up after nocong was removed
Clean-up:  Move some macros that are specific to the Van Jacobson
 implementation into xprt.c.  Get rid of the cong_wait field in
 rpc_xprt, which is no longer used.  Get rid of xprt_clear_backlog.

 Test-plan:
 Compile with CONFIG_NFS enabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:48 -04:00
Chuck Lever ed63c00370 [PATCH] RPC: remove xprt->nocong
Get rid of the "xprt->nocong" variable.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss with UDP mounts.
 Look for significant regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:47 -04:00
Chuck Lever a58dd398f5 [PATCH] RPC: add a release_rqst callout to the RPC transport switch
The final place where congestion control state is adjusted is in
 xprt_release, where each request is finally released.  Add a callout
 there to allow transports to perform additional processing when a
 request is about to be released.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for significant
 regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:45 -04:00
Chuck Lever 1570c1e41e [PATCH] RPC: add generic interface for adjusting the congestion window
A new interface that allows transports to adjust their congestion window
 using the Van Jacobson implementation in xprt.c is provided.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for
 significant regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:43 -04:00
Chuck Lever 46c0ee8bc4 [PATCH] RPC: separate xprt_timer implementations
Allow transports to hook the retransmit timer interrupt.  Some transports
 calculate their congestion window here so that a retransmit timeout has
 immediate effect on the congestion window.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for significant
 regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:41 -04:00
Chuck Lever 49e9a89086 [PATCH] RPC: expose API for serializing access to RPC transports
The next method we abstract is the one that releases a transport,
 allowing another task to have access to the transport.

 Again, one generic version of this is provided for transports that
 don't need the RPC client to perform congestion control, and one
 version is for transports that can use the original Van Jacobson
 implementation in xprt.c.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for
 significant regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:40 -04:00
Chuck Lever 12a804698b [PATCH] RPC: expose API for serializing access to RPC transports
The next several patches introduce an API that allows transports to
 choose whether the RPC client provides congestion control or whether
 the transport itself provides it.

 The first method we abstract is the one that serializes access to the
 RPC transport to prevent the bytes from different requests from mingling
 together.  This method provides proper request serialization and the
 opportunity to prevent new requests from being started because the
 transport is congested.

 The normal situation is for the transport to handle congestion control
 itself.  Although NFS over UDP was first, it has been recognized after
 years of experience that having the transport provide congestion control
 is much better than doing it in the RPC client.  Thus TCP, and probably
 every future transport implementation, will use the default method,
 xprt_lock_write, provided in xprt.c, which does not provide any kind
 of congestion control.  UDP can continue using the xprt.c-provided
 Van Jacobson congestion avoidance implementation.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for significant
 regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:38 -04:00
Chuck Lever fe3aca290f [PATCH] RPC: add API to set transport-specific timeouts
Prepare the way to remove the "xprt->nocong" variable by adding a callout
 to the RPC client transport switch API to handle setting RPC retransmit
 timeouts.

 Add a pair of generic helper functions that provide the ability to set a
 simple fixed timeout, or to set a timeout based on the state of a round-
 trip estimator.

 Test-plan:
 Use WAN simulation to cause sporadic bursty packet loss.  Look for significant
 regression in performance or client stability.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:36 -04:00
Chuck Lever 43118c29de [PATCH] RPC: get rid of xprt->stream
Now we can fix up the last few places that use the "xprt->stream"
 variable, and get rid of it from the rpc_xprt structure.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:35 -04:00
Chuck Lever 808012fbb2 [PATCH] RPC: skip over transport-specific heads automatically
Add a generic mechanism for skipping over transport-specific headers
 when constructing an RPC request.  This removes another "xprt->stream"
 dependency.

 Test-plan:
 Write-intensive workload on a single mount point (try both UDP and
 TCP).

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:33 -04:00
Chuck Lever 262965f53d [PATCH] RPC: separate TCP and UDP socket write paths
Split the RPC client's main socket write path into a TCP version and a UDP
 version to eliminate another dependency on the "xprt->stream" variable.

 Compiler optimization removes unneeded code from xs_sendpages, as this
 function is now called with some constant arguments.

 We can now cleanly perform transport protocol-specific return code testing
 and error recovery in each path.

 Test-plan:
 Millions of fsx operations.  Performance characterization such as
 "sio" or "iozone".  Examine oprofile results for any changes before and
 after this patch is applied.

 Version: Thu, 11 Aug 2005 16:08:46 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:31 -04:00
Chuck Lever b0d93ad511 [PATCH] RPC: separate TCP and UDP transport connection logic
Create separate connection worker functions for managing UDP and TCP
 transport sockets.  This eliminates several dependencies on "xprt->stream".

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon with
 v2, v3, and v4.

 Version: Thu, 11 Aug 2005 16:08:18 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:29 -04:00
Chuck Lever c7b2cae8a6 [PATCH] RPC: separate TCP and UDP write space callbacks
Split the socket write space callback function into a TCP version and UDP
 version, eliminating one dependence on the "xprt->stream" variable.

 Keep the common pieces of this path in xprt.c so other transports can use
 it too.

 Test-plan:
 Write-intensive workload on a single mount point.

 Version: Thu, 11 Aug 2005 16:07:51 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:28 -04:00
Chuck Lever 55aa4f58aa [PATCH] RPC: client-side transport switch cleanup
Clean-up: change some comments to reflect the realities of the new RPC
 transport switch mechanism.  Get rid of unused xprt_receive() prototype.

 Also, organize function prototypes in xprt.h by usage and scope.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:07:21 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:26 -04:00
Chuck Lever 44fbac2288 [PATCH] RPC: Add helper for waking tasks pending on a transport
Clean-up: remove only reference to xprt->pending from the socket transport
 implementation.  This makes a cleaner interface for other transport
 implementations as well.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:06:52 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:24 -04:00
Chuck Lever 86b9f57dfd [PATCH] RPC: Eliminate socket.h includes in RPC client
Clean-up: get rid of unnecessary socket.h and in.h includes in the generic
 parts of the RPC client.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:06:23 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:23 -04:00
Chuck Lever 2226feb6bc [PATCH] RPC: rename the sockstate field
Clean-up: get rid of a name reference to sockets in the generic parts of the
 RPC client by renaming the sockstate field in the rpc_xprt structure.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:05:53 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:21 -04:00
Chuck Lever 5dc07727f8 [PATCH] RPC: Rename xprt_lock
Clean-up: Replace the xprt_lock with something more aptly named.  This lock
 single-threads the XID and request slot reservation process.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:05:26 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:19 -04:00
Chuck Lever 4a0f8c04f2 [PATCH] RPC: Rename sock_lock
Clean-up: replace a name reference to sockets in the generic parts of the RPC
 client by renaming sock_lock in the rpc_xprt structure.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:05:00 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:17 -04:00
Chuck Lever b4b5cc85ed [PATCH] RPC: Reduce stack utilization in xs_sendpages
Reduce stack utilization of the RPC socket transport's send path.

 A couple of unlikely()s are added to ensure the compiler places the
 tail processing at the end of the csect.

 Test-plan:
 Millions of fsx operations.  Performance characterization such as "sio" or
 "iozone".

 Version: Thu, 11 Aug 2005 16:04:30 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:16 -04:00
Chuck Lever 9903cd1c27 [PATCH] RPC: transport switch function naming
Introduce block header comments and a function naming convention to the
 socket transport implementation.  Provide a debug setting for transports
 that is separate from RPCDBG_XPRT.  Eliminate xprt_default_timeout().

 Provide block comments for exposed interfaces in xprt.c, and eliminate
 the useless obvious comments.

 Convert printk's to dprintk's.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.

 Version: Thu, 11 Aug 2005 16:04:04 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:14 -04:00
Chuck Lever a246b0105b [PATCH] RPC: introduce client-side transport switch
Move the bulk of client-side socket-specific code into a separate source
 file, net/sunrpc/xprtsock.c.

 Test-plan:
 Millions of fsx operations.  Performance characterization such as "sio" or
 "iozone".  Destructive testing (unplugging the network temporarily, server
 reboots).  Connectathon with v2, v3, and v4.

 Version: Thu, 11 Aug 2005 16:03:38 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:12 -04:00
Chuck Lever 094bb20b9f [PATCH] RPC: extract socket logic common to both client and server
Clean-up: Move some code that is common to both RPC client- and server-side
 socket transports into its own source file, net/sunrpc/socklib.c.

 Test-plan:
 Compile kernel with CONFIG_NFS enabled.  Millions of fsx operations over
 UDP, client and server.  Connectathon over UDP.

 Version: Thu, 11 Aug 2005 16:03:09 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:11 -04:00
Chuck Lever 602f83273c [PATCH] RPC: portmapper doesn't need a reserved port
The in-kernel portmapper does not require a reserved port for making
 bind queries.

 Test-plan:
 Tens of runs of the Connectathon locking suite with TCP and UDP
 against several other NFS server implementations using NFSv3,
 not NFSv4 (which doesn't require rpcbind).

 Version: Thu, 11 Aug 2005 16:02:43 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:08 -04:00
Chuck Lever eab5c084b8 [PATCH] NFS: use a constant value for TCP retransmit timeouts
Implement a best practice: don't use exponential backoff when computing
 retransmit timeout values on TCP connections, but simply retransmit
 at regular intervals.

 This also fixes a bug introduced when xprt_reset_majortimeo() was added.

 Test-plan:
 Enable RPC debugging and watch timeout behavior on a NFS/TCP mount.

 Version: Thu, 11 Aug 2005 16:02:19 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:06 -04:00
Chuck Lever da35187801 [PATCH] RPC: proper soft timeout behavior for rpcbind
Implement a best practice:  for soft mounts, an rpcbind timeout should
 cause an RPC request to fail.

 This also provides an FSM hook for retrying an rpcbind with a different
 rpcbind protocol version.  We'll use this later to try multiple rpcbind
 protocol versions when binding.  To enable this, expose the RPC error
 code returned during a portmap request to the FSM so it can make some
 decision about how to report, retry, or fail the request.

 Test-plan:
 Hundreds of passes with connectathon NFSv3 locking suite, on the client
 and server.

 Version: Thu, 11 Aug 2005 16:01:53 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:04 -04:00
Chuck Lever 23475d66bd [PATCH] RPC: Report connection errors properly when mounting with "soft"
Fix up xprt_connect_status: the soft timeout logic was clobbering tk_status,
 so TCP connect errors were not properly reported on soft mounts.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.

 Version: Thu, 11 Aug 2005 16:01:28 -0400

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23 12:38:03 -04:00
Sridhar Samudrala eb0e007687 [SCTP]: Fix SCTP_SHUTDOWN notifications.
Fix to allow SCTP_SHUTDOWN notifications to be received on 1-1 style
SCTP SOCK_STREAM sockets.

Add SCTP_SHUTDOWN notification to the receive queue before updating
the state of the association.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:48:38 -07:00
Harald Welte 1dfbab5949 [NETFILTER] Fix conntrack event cache deadlock/oops
This patch fixes a number of bugs.  It cannot be reasonably split up in
multiple fixes, since all bugs interact with each other and affect the same
function:

Bug #1:
The event cache code cannot be called while a lock is held.  Therefore, the
call to ip_conntrack_event_cache() within ip_ct_refresh_acct() needs to be
moved outside of the locked section.  This fixes a number of 2.6.14-rcX
oops and deadlock reports.

Bug #2:
We used to call ct_add_counters() for unconfirmed connections without
holding a lock.  Since the add operations are not atomic, we could race
with another CPU.

Bug #3:
ip_ct_refresh_acct() lost REFRESH events in some cases where refresh
(and the corresponding event) are desired, but no accounting shall be
performed.  Both, evenst and accounting implicitly depended on the skb
parameter bein non-null.   We now re-introduce a non-accounting
"ip_ct_refresh()" variant to explicitly state the desired behaviour.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:46:57 -07:00
Alexey Dobriyan 67497205b1 [NETFILTER] Fix sparse endian warnings in pptp helper
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:45:24 -07:00
Harald Welte 0ae5d253ad [NETFILTER] fix DEBUG statement in PPTP helper
As noted by Alexey Dobriyan, the DEBUGP statement prints the wrong
callID.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:44:58 -07:00
Vlad Drukker 2a7bc3c94c [BRIDGE]: TSO fix in br_dev_queue_push_xmit
Signed-off-by: Vlad Drukker <vlad@storewiz.com>
Acked-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:35:34 -07:00
Herbert Xu 83ca28befc [TCP]: Adjust Reno SACK estimate in tcp_fragment
Since the introduction of TSO pcount a year ago, it has been possible
for tcp_fragment() to cause packets_out to decrease.  Prior to that,
tcp_retrans_try_collapse() was the only way for that to happen on the
retransmission path.

When this happens with Reno, it is possible for sasked_out to become
invalid because it is only an estimate and not tied to any particular
packet on the retransmission queue.

Therefore we need to adjust sacked_out as well as left_out in the Reno
case.  The following patch does exactly that.

This bug is pretty difficult to trigger in practice though since you
need a SACKless peer with a retransmission that occurs just as the
cached MTU value expires.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 23:32:56 -07:00
James Ketrenos 6eb6edf04a [PATCH] ieee80211: in-tree driver updates to sync with latest ieee80211 series
Changed crypto method from requiring a struct ieee80211_device reference
to the init handler.  Instead we now have a get/set flags method for
each crypto component.

Setting of TKIP countermeasures can now be done via
set_flags(IEEE80211_CRYPTO_TKIP_COUNTERMEASURES)

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-22 15:40:59 -04:00
James Ketrenos 31b59eaee8 [PATCH] ieee80211: Added handle_deauth() callback, enhanced tkip/ccmp support of varying hw/sw offload
tree de81b55e78e85997642c651ea677078d0554a14f
parent c8030da8c159f8b82712172a6748a42523aea83a
author James Ketrenos <jketreno@linux.intel.com> 1127104380 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127315225 -0500

Added handle_deauth() callback.
Enhanced crypt_{tkip,ccmp} to support varying splits of HW/SW offload.
Changed channel freq to u32 from u16.
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-22 15:39:41 -04:00
James Ketrenos 31696160c7 [PATCH] ieee80211: Added subsystem version string and reporting via MODULE_VERSION
tree c1b50ac5d2d1f9b727c39c6bd86a7872f25a1127
parent 1bb997a3ac7dd1941e02426d2f70bd28993a82b7
author James Ketrenos <jketreno@linux.intel.com> 1126720779 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127314674 -0500

Added subsystem version string and reporting via MODULE_VERSION and
pritnk during load.

NOTE:  This is the version support split out from patch 24/29 of the
prior series.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-22 15:39:41 -04:00
Arnaldo Carvalho de Melo 8420e1b541 [LLC]: fix llc_ui_recvmsg, making it behave like tcp_recvmsg
In fact it is an exact copy of the parts that makes sense to LLC :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 08:29:08 -03:00
Arnaldo Carvalho de Melo d389424e00 [LLC]: Fix the accept path
Borrowing the structure of TCP/IP for this. On the receive of new connections I
was bh_lock_socking the _new_ sock, not the listening one, duh, now it survives
the ssh connections storm I've been using to test this specific bug.

Also fixes send side skb sock accounting.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 07:57:21 -03:00
Arnaldo Carvalho de Melo 2928c19e10 [LLC]: Fix sparse warnings
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 05:14:33 -03:00
Jochen Friedrich 0519d8fbab [TR]: Set correct frame type for SNAP packets
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:51:56 -03:00
Jochen Friedrich 096f0eb1df [LLC]: Fix llc_fixup_skb() bug
llc_fixup_skb() had a bug dropping 3 bytes packets (like UA frames). Token ring
doesn't pad these frames.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:48:46 -03:00
Jochen Friedrich 5564af21ae [LLC]: Fix for Bugzilla ticket #5157
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:46:44 -03:00
Jochen Friedrich cf309e3fb8 [LLC]: Fix for Bugzilla ticket #5156
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:44:55 -03:00
Arnaldo Carvalho de Melo 6e2144b768 [LLC]: Use refcounting with struct llc_sap
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:43:05 -03:00
Arnaldo Carvalho de Melo 04e4223f44 [LLC]: Do better struct sock accounting on skbs
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:40:59 -03:00
Arnaldo Carvalho de Melo afdbe35787 [LLC]: Use sk_wait_data
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:37:07 -03:00
Arnaldo Carvalho de Melo 249ff1c6d3 [LLC]: Use some more likely/unlikely
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:32:10 -03:00
Arnaldo Carvalho de Melo 590232a715 [LLC]: Add sysctl support for the LLC timeouts
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:30:44 -03:00
Arnaldo Carvalho de Melo 54fb7f25f1 [LLC]: Use the sk_wait_event primitive
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:26:14 -03:00
Arnaldo Carvalho de Melo b35bd11019 [LLC]: Convert llc_ui_wait_for_ functions to use prepare_to_wait/finish_wait
And make it look more like the similar routines in the TCP/IP source code.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:22:39 -03:00
Arnaldo Carvalho de Melo 72b1ad4a7e [LLC]: Remove unused functions from llc_c_ev.c
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:19:52 -03:00
Arnaldo Carvalho de Melo b9441fc337 [LLC]: Use const in llc_c_ev.c
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:09:45 -03:00
Arnaldo Carvalho de Melo af426d327c [LLC]: Help the compiler with likely/unlikely, saving some more bytes
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:59:22 -03:00
Arnaldo Carvalho de Melo 0eb8017242 [LLC]: Mark llc_find_next_offset as __init, saving some more bytes
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:57:55 -03:00
Arnaldo Carvalho de Melo 5a770c0262 [LLC]: Update comments for llc_ui_bind and llc_ui_autobind to match new behaviour
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:56:26 -03:00
Arnaldo Carvalho de Melo 774ccb4f64 [LLC]: Remove unneeded temp net_device variables
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:53:35 -03:00
Arnaldo Carvalho de Melo e0dd55190f [LLC]: introduce llc_conn_tmr_common_cb, to avoid code duplication
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:50:15 -03:00
Arnaldo Carvalho de Melo 838a75dae0 [LLC]: Remove unneeded f_bit variables
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:44:23 -03:00
Arnaldo Carvalho de Melo bdcc66cca8 [LLC]: Simplify llc_c_ac code, removing unneeded assignments to variables
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:38:15 -03:00
Arnaldo Carvalho de Melo 1d67e6501b [LLC]: Make llc_frame_alloc take a net_device as an argument
So as to set the newly created sk_buff ->dev member with it, that way we stop
using dev_base->next, that is the wrong thing to do, as there may well be
several interfaces being used with LLC. This was not such a big problem after
all as most of the users of llc_alloc_frame were setting the correct dev, but
this way code is reduced.

This also fixes another bug in llc_station_ac_send_null_dsap_xid_c, that was
not setting the skb->dev field.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 03:27:56 -03:00
James Ketrenos 9a01c16bd4 [PATCH] ieee82011: Remove WIRELESS_EXT ifdefs
Remove old WIRELESS_EXT version compatibility

In-tree doesn't need to maintain backward compatibility.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:19:09 -04:00
James Ketrenos ebeaddcc02 [PATCH] ieee80211: Updated copyright dates
tree 0d3e41e574fcb41b9da7f0b7e1d27ec350726654
parent dbe2885fe2f454d538eaaabefc741ded1026f476
author James Ketrenos <jketreno@linux.intel.com> 1126720499 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127314531 -0500

Updated copyright dates.

NOTE:  This is a split out of just the copyright updates from patch
24/29 in the prior series.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:04:58 -04:00
James Ketrenos 7dc888fefc [PATCH] ieee80211: Keep auth mode unchanged after iwconfig key off/on cycle
tree 2e6f6e7dc4f4eeb8e3dc265020016dd53e40578a
parent ba2075794a089430b3dd7c90ff46ce1b67e9c7cc
author Zhu Yi <yi.zhu@intel.com> 1125551043 +0800
committer James Ketrenos <jketreno@linux.intel.com> 1127314475 -0500

[Bug 768] Keep auth mode unchanged after iwconfig key off/on cycle.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:04:57 -04:00
James Ketrenos ccd0fda3a6 [PATCH] ieee80211: Mixed PTK/GTK CCMP/TKIP support
tree 5c7559a1216ae1121487f6aed94a6017490729b3
parent c1ff4c22e5622c8987bf96c09158c4924cde98c2
author Hong Liu <hong.liu@intel.com> 1125482767 +0800
committer James Ketrenos <jketreno@linux.intel.com> 1127314427 -0500

Mixed PTK/GTK CCMP/TKIP support.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:04:57 -04:00
James Ketrenos 42c94e43be [PATCH] ieee80211: Type-o, capbility definition for QoS, and ERP parsing
tree 3ac0dd07b9972dfd68fee47ec2152d3d378de000
parent 9ada1d971d9829c34a14d98840080b7e69fdff6b
author Mohamed Abbad <mohamed.abbas@intel.com> 1126054379 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127314340 -0500

Type-o, capbility definition for QoS, and ERP parsing

Added WLAN_CAPABILITY_QOS
Fixed type-o WLAN_CAPABILITY_OSSS_OFDM -> WLAN_CAPABILITY_DSSS_OFDM
Added ERP IE parsing to ieee80211_rx
Added handle_probe_request callback.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:04:57 -04:00
James Ketrenos 02cda6ae01 [PATCH] ieee80211: Added ieee80211_geo to provide helper functions
tree 385b391fc0d7c124cd0547fdb6183e9a0c333391
parent 97d7a47f76e72bedde7f402785559ed4c7a8e8e8
author James Ketrenos <jketreno@linux.intel.com> 1124447590 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313735 -0500

Added ieee80211_geo to provide helper functions to drivers for
implementing supported channel maps.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:03:55 -04:00
James Ketrenos 9e8571affd [PATCH] ieee80211: Add QoS (WME) support to the ieee80211 subsystem
tree a3ad796273e98036eb0e9fc063225070fa24508a
parent 1b9c0aeb377abf8e4a43a86cff42382f74ca0259
author Mohamed Abbas <mabbas@linux.intel.com> 1124447069 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313435 -0500

Add QoS (WME) support to the ieee80211 subsystem.

NOTE: This requires drivers that use the ieee80211 hard_start_xmit
(ipw2100 and ipw2200) to add the priority parameter to their callback.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:03:54 -04:00
James Ketrenos 2c0aa2a5c2 [PATCH] ieee80211: Return NETDEV_TX_BUSY when QoS buffer full
tree ba6509c7cd1dd4244a2f285f2da5d632e7ffbb25
parent 7b5f9f2ddcabdaea214527a895e6e8445cafdd80
author James Ketrenos <jketreno@linux.intel.com> 1124447000 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313383 -0500

Per the conversations with folks at OLS, the QoS layer in 802.11
drivers can now result in NETDEV_TX_BUSY being returned when the queue
a packet is targetted for is full.

To implement this, ieee80211_xmit will now call the driver's
is_queue_full to determine if the current priority queue is full.  If
so, NETDEV_TX_BUSY is returned to the kernel and no processing is done
on the frame.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:03:54 -04:00
James Ketrenos 1264fc0498 [PATCH] ieee80211: Fix TKIP, repeated fragmentation problem, and payload_size reporting
tree 8428e9f510e6ad6c77baec89cb57374842abf733
parent d78bfd3ddae9c422dd350159110f9c4d7cfc50de
author Liu Hong <hong.liu@intel.com> 1124446520 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313183 -0500

Fix TKIP, repeated fragmentation problem, and payload_size reporting

1. TKIP encryption
    Originally, TKIP encryption issues msdu + mpdu encryption on every
    fragment. Change the behavior to msdu encryption on the whole
    packet, then mpdu encryption on every fragment.

2. Avoid repeated fragmentation when !host_encrypt.
    We only need do fragmentation when using host encryption. Otherwise
    we only need pass the whole packet to driver, letting driver do the
    fragmentation.

3. change the txb->payload_size to correct value
    FW will use this value to determine whether to do fragmentation. If
    we pass the wrong value, fw may cut on the wrong bound which will
    make decryption fail when we do host encryption.

NOTE:  This requires changing drivers (hostap) that have
extra_prefix_len used within them (structure member name change).

Signed-off-by: Hong Liu <liu.hong@intel.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:02:31 -04:00
James Ketrenos 3f552bbf86 [PATCH] ieee82011: Added ieee80211_tx_frame to convert generic 802.11 data frames, and callbacks
tree 40adc78b623ae70d56074934ec6334eb4f0ae6a5
parent db43d847bcebaa3df6414e26d0008eb21690e8cf
author James Ketrenos <jketreno@linux.intel.com> 1124445938 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313102 -0500

Added ieee80211_tx_frame to convert generic 802.11 data frames into
txbs for transmission.

Added several purpose specific callbacks (handle_assoc, handle_auth,
etc.) which the driver can register with for being notified on
reception of variouf frame elements.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:02:31 -04:00
James Ketrenos 3cdd00c582 [PATCH] ieee80211: adds support for the creation of RTS packets
tree b45c9c1017fd23216bfbe71e441aed9aa297fc84
parent 04aacdd71e904656a304d923bdcf57ad3bd2b254
author Ivo van Doorn <IvDoorn@gmail.com> 1124445405 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127313029 -0500

This patch adds support for the creation of RTS packets when the
config flag CFG_IEEE80211_RTS has been set.

Signed-Off-By: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:02:30 -04:00
James Ketrenos ee34af37c0 [PATCH] ieee80211: Renamed ieee80211_hdr to ieee80211_hdr_3addr
tree e9c18b2c8e5ad446a4d213243c2dcf9fd1652a7b
parent 4e97ad6ae7084a4f741e94e76c41c68bc7c5a76a
author James Ketrenos <jketreno@linux.intel.com> 1124444315 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127312922 -0500

Renamed ieee80211_hdr to ieee80211_hdr_3addr and modified ieee80211_hdr
to just contain the frame_ctrl and duration_id.

Changed uses of ieee80211_hdr to ieee80211_hdr_4addr or
ieee80211_hdr_3addr based on what was expected for that portion of code.

NOTE: This requires changes to ipw2100, ipw2200, hostap, and atmel
drivers.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:02:30 -04:00
James Ketrenos e0d369d1d9 [PATCH] ieee82011: Added WE-18 support to default wireless extension handler
tree 1536f39c18756698d033da72c49300a561be1289
parent 07172d7c9f10ee3d05d6f6489ba6d6ee2628da06
author Liu Hong <hong.liu@intel.com> 1124436225 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127312664 -0500

Added WE-18 support to default wireless extension handler in ieee80211
subsystem.

Updated patch since last send to account for ieee80211_device parameter
being added to the crypto init method.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:02:30 -04:00
James Ketrenos 259bf1fd8a [PATCH] ieee80211: Allow drivers to fix an issue when using wpa_supplicant with WEP
tree 898fedef6ca1b5b58b8bdf7e6d8894a78bbde4cd
parent 8720fff53090ae428d2159332b6f4b2749dea10f
author Zhu Yi <jketreno@io.(none)> 1124435746 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127312509 -0500

Allow drivers to fix an issue when using wpa_supplicant with WEP.

The problem is introduced by the hwcrypto patch. We changed indicator of
the encryption request from the upper layer (i.e. wpa_supplicant):

In the original host based crypto the driver could use: crypt &&
crypt->ops.

In the new hardware based crypto, the driver should use the flags
specified in ieee->sec.encrypt.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:01:52 -04:00
James Ketrenos 0ad0c3c644 [PATCH] ieee80211: Fix kernel Oops when module unload
tree b69e983266840983183a00f5ac02c66d5270ca47
parent cdd6372949b76694622ed74fe36e1dd17a92eb71
author Zhu Yi <jketreno@io.(none)> 1124435425 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127312421 -0500

Fix kernel Oops when module unload.

Export a new function ieee80211_crypt_quiescing from ieee80211. Device
drivers call it to make the host crypto stack enter the quiescence
state, which means "process existing requests, but don't accept new
ones". This is usually called during a driver's host crypto data
structure free (module unload) path.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:01:52 -04:00
James Ketrenos 42e349fd10 [PATCH] ieee80211: Fix time calculation, switching to use jiffies_to_msecs
tree b9cdd7058b787807655ea6f125e2adbf8d26c863
parent 85d9b2bddfcf3ed2eb4d061947c25c6a832891ab
author Zhu Yi <jketreno@io.(none)> 1124435212 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127312152 -0500

Fix time calculation, switching to use jiffies_to_msecs.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:01:52 -04:00
James Ketrenos f1bf6638af [PATCH] ieee80211: Hardware crypto and fragmentation offload support
tree 5322d496af90d03ffbec27292dc1a6268a746ede
parent 6c9364386ccb786e4a84427ab3ad712f0b7b8904
author James Ketrenos <jketreno@linux.intel.com> 1124432367 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127311810 -0500

Hardware crypto and fragmentation offload support added (Zhu Yi)

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:01:52 -04:00
James Ketrenos 20d64713ae [PATCH] ieee80211: Fixed a kernel oops on module unload
tree 367069f24fc38b4aa910e86ff40094d2078d8aa7
parent a33a198201
author James Ketrenos <jketreno@linux.intel.com> 1124430800 -0500
committer James Ketrenos <jketreno@linux.intel.com> 1127310571 -0500

Fixed a kernel oops on module unload by adding spin lock protection to
ieee80211's crypt handlers (thanks to Zhu Yi)

Modified scan result logic to report WPA and RSN IEs if set (vs.being
based on wpa_enabled)

Added ieee80211_device as the first parameter to the crypt init()
method.  TKIP modified to use that structure for determining whether to
countermeasures are active.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-21 23:01:52 -04:00
Jeff Garzik a3536c839f Merge /spare/repo/linux-2.6/ 2005-09-21 22:34:08 -04:00
Stephen Hemminger 7957aed72b [TCP]: Set default congestion control correctly for incoming connections.
Patch from Joel Sing to fix the default congestion control algorithm
for incoming connections. If a new congestion control handler is added
(via module), it should become the default for new
connections. Instead, the incoming connections use reno. The cause is
incorrect initialisation causes the tcp_init_congestion_control()
function to return after the initial if test fails.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 00:19:46 -07:00
Stephen Hemminger 78c6671a88 [FIB_TRIE]: message cleanup
Cleanup the printk's in fib_trie:
	* Convert a couple of places in the dump code to BUG_ON
	* Put log level's on each message
The version message really needed the message since it leaks out
on the pretty Fedora bootup.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Robert Olsson <Robert.Olsson@data.slu.se>,
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 00:15:39 -07:00
Eric W. Biederman 0fb375fb9b [AF_PACKET]: Allow for > 8 byte hardware addresses.
The convention is that longer addresses will simply extend
the hardeware address byte arrays at the end of sockaddr_ll and
packet_mreq.

In making this change a small information leak was also closed.
The code only initializes the hardware address bytes that are
used, but all of struct sockaddr_ll was copied to userspace.
Now we just copy sockaddr_ll to the last byte of the hardware
address used.

For error checking larger structures than our internal
maximums continue to be allowed but an error is signaled if we can
not fit the hardware address into our internal structure.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 00:11:37 -07:00
Linus Torvalds 875bd5ab01 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-19 18:46:11 -07:00
Mark J Cox 6d1cfe3f17 [PATCH] raw_sendmsg DoS on 2.6
Fix unchecked __get_user that could be tricked into generating a
memory read on an arbitrary address.  The result of the read is not
returned directly but you may be able to divine some information about
it, or use the read to cause a crash on some architectures by reading
hardware state.  CAN-2004-2492.

Fix from Al Viro, ack from Dave Miller.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-19 18:45:42 -07:00
Herbert Xu e14c3caf60 [TCP]: Handle SACK'd packets properly in tcp_fragment().
The problem is that we're now calling tcp_fragment() in a context
where the packets might be marked as SACKED_ACKED or SACKED_RETRANS.
This was not possible before as you never retransmitted packets that
are so marked.

Because of this, we need to adjust sacked_out and retrans_out in
tcp_fragment().  This is exactly what the following patch does.

We also need to preserve the SACKED_ACKED/SACKED_RETRANS marking
if they exist.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 18:18:38 -07:00
Alexey Dobriyan 3c3f8f25c1 [8021Q]: Add endian annotations.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:41:28 -07:00
Harald Welte 8922bc93aa [NETFILTER]: Export ip_nat_port_{nfattr_to_range,range_to_nfattr}
Those exports are needed by the PPTP helper following in the next
couple of changes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:35:57 -07:00
Patrick McHardy a41bc00234 [NETFILTER]: Rename misnamed function
Both __ip_conntrack_expect_find and ip_conntrack_expect_find_get take
a reference to the expectation, the difference is that callers of
__ip_conntrack_expect_find must hold ip_conntrack_lock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:35:31 -07:00
Yasuyuki Kozakai e674d0f38d [NETFILTER] ip6tables: remove duplicate code
Some IPv6 matches have very similar loops to find IPv6 extension header
and we can unify them. This patch introduces ipv6_find_hdr() to do it.
I just checked that it can find the target headers in the packet which has
dst,hbh,rt,frag,ah,esp headers.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:34:40 -07:00
Harald Welte 926b50f92a [NETFILTER]: Add new PPTP conntrack and NAT helper
This new "version 3" PPTP conntrack/nat helper is finally ready for
mainline inclusion.  Special thanks to lots of last-minute bugfixing
by Patric McHardy.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:33:08 -07:00
Robert Olsson 772cb712b1 [IPV4]: fib_trie RCU refinements
* This patch is from Paul McKenney's RCU reviewing. 

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:31:18 -07:00
Robert Olsson 1d25cd6cc2 [IPV4]: fib_trie tnode stats refinements
* Prints the route tnode and set the stats level deepth as before.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 15:29:52 -07:00
Harald Welte 628f87f3d5 [NETFILTER]: Solve Kconfig dependency problem
As suggested by Roman Zippel.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:33:02 -07:00
Mitsuru KANDA 987905ded3 [IPV6]: Check connect(2) status for IPv6 UDP socket (Re: xfrm_lookup)
I think we should cache the per-socket route(dst_entry) only when the
IPv6 UDP socket is connect(2)'ed.
(which is same as IPv4 UDP send behavior)

Signed-off-by: Mitsuru KANDA <mk@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:30:08 -07:00
Arnaldo Carvalho de Melo 88f964db6e [DCCP]: Introduce CCID getsockopt for the CCIDs
Allocation for the optnames is similar to the DCCP options, with a
range for rx and tx half connection CCIDs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:19:32 -07:00
Arnaldo Carvalho de Melo 561713cf47 [DCCP]: Don't use necessarily the same CCID for tx and rx
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:18:52 -07:00
Arnaldo Carvalho de Melo 65299d6c3c [CCID3]: Introduce include/linux/tfrc.h
Moving the TFRC sender and receiver variables to separate structs, so
that we can copy these structs to userspace thru getsockopt,
dccp_diag, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:18:32 -07:00
Arnaldo Carvalho de Melo ae31c3399d [DCCP]: Move the ack vector code to net/dccp/ackvec.[ch]
Isolating it, that will be used when we introduce a CCID2 (TCP-Like)
implementation.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-18 00:17:51 -07:00
Harald Welte 9eb0eec74d [NETFILTER] move nfnetlink options to right location in kconfig menu
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-17 00:41:21 -07:00
Harald Welte 777ed97f3e [NETFILTER] Fix Kconfig dependencies for nfnetlink/ctnetlink
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-17 00:41:02 -07:00
Harald Welte a8f39143ac [NETFILTER]: Fix oops in conntrack event cache
ip_ct_refresh_acct() can be called without a valid "skb" pointer.
This used to work, since ct_add_counters() deals with that fact.
However, the recently-added event cache doesn't handle this at all.

This patch is a quick fix that is supposed to be replaced soon by a cleaner
solution during the pending redesign of the event cache.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 17:00:38 -07:00
KOVACS Krisztian 136e92bbec [NETFILTER] CLUSTERIP: use a bitmap to store node responsibility data
Instead of maintaining an array containing a list of nodes this instance
is responsible for let's use a simple bitmap. This provides the
following features:

  * clusterip_responsible() and the add_node()/delete_node() operations
    become very simple and don't need locking
  * the config structure is much smaller

In spite of the completely different internal data representation the
user-space interface remains almost unchanged; the only difference is
that the proc file does not list nodes in the order they were added.
(The target info structure remains the same.)

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 17:00:04 -07:00
KOVACS Krisztian 4451362445 [NETFILTER] CLUSTERIP: introduce reference counting for entries
The CLUSTERIP target creates a procfs entry for all different cluster
IPs.  Although more than one rules can refer to a single cluster IP (and
thus a single config structure), removal of the procfs entry is done
unconditionally in destroy(). In more complicated situations involving
deferred dereferencing of the config structure by procfs and creating a
new rule with the same cluster IP it's also possible that no entry will
be created for the new rule.

This patch fixes the problem by counting the number of entries
referencing a given config structure and moving the config list
manipulation and procfs entry deletion parts to the
clusterip_config_entry_put() function.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 16:59:46 -07:00
Arnaldo Carvalho de Melo 67e6b62921 [DCCP]: Introduce DCCP_SOCKOPT_SERVICE
As discussed in the dccp@vger mailing list:

Now applications have to use setsockopt(DCCP_SOCKOPT_SERVICE, service[s]),
prior to calling listen() and connect().

An array of unsigned ints can be passed meaning that the listening sock accepts
connection requests for several services.

With this we can ditch struct sockaddr_dccp and use only sockaddr_in (and
sockaddr_in6 in the future).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 16:58:40 -07:00
Arnaldo Carvalho de Melo 0c10c5d968 [DCCP]: More precisely set reset_code when sending RESET packets
Moving the setting of DCCP_SKB_CB(skb)->dccpd_reset_code to the places
where events happen that trigger sending a RESET packet.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 16:58:33 -07:00
David S. Miller 37f7f421cc [NET]: Do not leak MSG_CMSG_COMPAT into userspace.
Noticed by Sridhar Samudrala.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-16 16:51:01 -07:00
James Ketrenos 262d8e4677 [PATCH] ieee80211 Switched to sscanf in store_debug_level
Switched to sscanf as per friendly comment in store_debug_level.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:11:58 -04:00
James Ketrenos 18294d8727 [PATCH] ieee80211 Cleanup memcpy parameters.
Cleanup memcpy parameters.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:11:58 -04:00
James Ketrenos 7b1fa54020 [PATCH] ieee80211 Removed ieee80211_info_element_hdr
Removed ieee80211_info_element_hdr structure as ieee80211_info_element
provides the same use.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:10:56 -04:00
James Ketrenos 68e4e036b8 [PATCH] Changed 802.11 headers to use ieee80211_info_element[0]
Changed 802.11 headers to use ieee80211_info_element as zero sized
array so that sizeof calculations do not account for IE sizes.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:10:56 -04:00
James Ketrenos 74079fdce4 [PATCH] ieee80211 Added wireless spy support
Added wireless spy support to Rx code path.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>

NOTE:  Looks like scripts/Lindent generated output different
than the Lindented version already in-kernel, hence all the
whitespace deltas...  *sigh*
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:06:32 -04:00
James Ketrenos b1b508e1b1 [PATCH] ieee80211 quality scaling algorithm extension handler
Incorporated Bill Moss' quality scaling algorithm into default wireless
extension handler.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:06:32 -04:00
James Ketrenos fd27817ce9 [PATCH] Fixed some endian issues with 802.11 header usage in ieee80211_rx.c
Fixed some endian issues with 802.11 header usage in ieee80211_rx.c

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-16 03:06:31 -04:00
David L Stevens 40796c5e8f [IPV6]: Fix per-socket multicast filtering in sk_reuse case
per-socket multicast filters were not being applied to all sockets
in the case of an exact-match bound address, due to an over-exuberant
"return" in the look-up code. Fix below. IPv4 does not have this problem.

Thanks to Hoerdt Mickael for reporting the bug.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:10:20 -07:00
Julian Anastasov 87375ab47c [IPVS]: ip_vs_ftp breaks connections using persistence
ip_vs_ftp when loaded can create NAT connections with unknown client
port for passive FTP. For such expectations we lookup with cport=0 on
incoming packet but it matches the format of the persistence templates
causing packets to other persistent virtual servers to be forwarded to
real server without creating connection. Later the reply packets are
treated as foreign and not SNAT-ed.

This patch changes the connection lookup for packets from clients:

* introduce IP_VS_CONN_F_TEMPLATE connection flag to mark the
  connection as template

* create new connection lookup function just for templates -
  ip_vs_ct_in_get

* make sure ip_vs_conn_in_get hits only connections with
  IP_VS_CONN_F_NO_CPORT flag set when s_port is 0. By this way
  we avoid returning template when looking for cport=0 (ftp)

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:08:51 -07:00
Julian Anastasov f5e229db9c [IPVS]: Really invalidate persistent templates
Agostino di Salle noticed that persistent templates are not
invalidated due to buggy optimization.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:04:23 -07:00
Bart De Schuymer 1c011bed5f [BRIDGE-NF]: Fix iptables redirect on bridge interface
Here's a slightly altered patch, originally from Mark Glines who
diagnosed and fixed the problem.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 20:55:16 -07:00
Denis Lukianov de9daad90e [MCAST]: Fix MCAST_EXCLUDE line dupes
This patch fixes line dupes at /ipv4/igmp.c and /ipv6/mcast.c in the  
2.6 kernel, where MCAST_EXCLUDE is mistakenly used instead of  
MCAST_INCLUDE.

Signed-off-by: Denis Lukianov <denis@voxelsoft.com>
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 20:53:42 -07:00
Herbert Xu 3c05d92ed4 [TCP]: Compute in_sacked properly when we split up a TSO frame.
The problem is that the SACK fragmenting code may incorrectly call
tcp_fragment() with a length larger than the skb->len.  This happens
when the skb on the transmit queue completely falls to the LHS of the
SACK.

And add a BUG() check to tcp_fragment() so we can spot this kind of
error more quickly in the future.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 20:50:35 -07:00
David S. Miller 033d974405 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-09-13 16:32:40 -07:00
Arnaldo Carvalho de Melo 2b80230a7f [DCCP]: Handle SYNC packets in dccp_rcv_state_process
Eliciting a SYNCACK in response, we were handling SYNC packets
only in the DCCP_OPEN state, in dccp_rcv_established.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-13 19:05:08 -03:00
Arnaldo Carvalho de Melo 811265b8e8 [DCCP]: Check if already in the CLOSING state in dccp_rcv_closereq
It is possible to receive more than one CLOSEREQ packet if the
CLOSE packet sent in response is somehow lost, change the state
to DCCP_CLOSING only on the first CLOSEREQ packet received.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-13 19:03:15 -03:00
David S. Miller ae01d2798d Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2005-09-13 14:03:09 -07:00
Patrick McHardy adcb5ad1e5 [NETFILTER]: Fix DHCP + MASQUERADE problem
In 2.6.13-rcX the MASQUERADE target was changed not to exclude local
packets for better source address consistency. This breaks DHCP clients
using UDP sockets when the DHCP requests are caught by a MASQUERADE rule
because the MASQUERADE target drops packets when no address is configured
on the outgoing interface. This patch makes it ignore packets with a
source address of 0.

Thanks to Rusty for this suggestion.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13 13:49:15 -07:00
Patrick McHardy cd0bf2d796 [NETFILTER]: Fix rcu race in ipt_REDIRECT
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13 13:48:58 -07:00
Patrick McHardy e7fa1bd93f [NETFILTER]: Simplify netbios helper
Don't parse the packet, the data is already available in the conntrack
structure.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13 13:48:34 -07:00
Patrick McHardy 5cb30640ce [NETFILTER]: Use correct type for "ports" module parameter
With large port numbers the helper_names buffer can overflow.
Noticed by Samir Bellabes <sbellabes@mandriva.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-13 13:48:00 -07:00
Neil Brown 939bb7ef90 [PATCH] Code cleanups in calbacks in svcsock
Change a printk(KERN_WARNING to dprintk, and it is really only interesting
when trying to debug a problem, and can occur normally without error.

Remove various gratuitous gotos in surrounding code, and remove some
type-cast assignments from inside 'if' conditionals, as that is just
obscuring what it going on.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:32 -07:00
Marcel Holtmann 354d28d5f8 [Bluetooth] Prevent RFCOMM connections through the RAW socket
This patch adds additional checks to prevent RFCOMM connections be
established through the RAW socket interface.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-09-13 01:32:31 +02:00
Marcel Holtmann 21d9e30ed0 [Bluetooth] Add support for extended inquiry responses
This patch adds the handling of the extended inquiry responses and
inserts them into the inquiry cache.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2005-09-13 01:32:25 +02:00
Ralf Baechle b88a762b60 [NETROM]: Introduct stuct nr_private
NET/ROM's virtual interfaces don't have a proper private data
structure yet.  Create struct nr_private and put the statistics there.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:28:03 -07:00
Ralf Baechle e21ce8c7c0 [NETROM]: Implement G8PZT Circuit reset for NET/ROM
NET/ROM is lacking a connection reset like TCP's RST flag which at times
may result in a connecting having to slowly timing out instead of just being
reset.  An earlier attempt to reset the connection by sending a
NR_CONNACK | NR_CHOKE_FLAG transport was inacceptable as it did result in
crashes of BPQ systems.  An alternative approach of introducing a new
transport type 7 (NR_RESET) has be implemented several years ago in
Paula Jayne Dowie G8PZT's Xrouter.

Implement NR_RESET for Linux's NET/ROM but like any messing with the state
engine consider this experimental for now and thus control it by a sysctl
(net.netrom.reset) which for the time being defaults to off.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:27:37 -07:00
Ralf Baechle d2ce4bc340 [ROSE]: ROSE has no ARP
ARP over ROSE does not exist so it's obviously not implemented on any
ROSE stack, so the ROSE interfaces really should default to IFF_NOARP.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:26:52 -07:00
Ralf Baechle 723772913e [NETROM]: NET/ROM has no ARP
ARP over NET/ROM does not exist so it's obviously not implemented on any
NET/ROM stack, so the NET/ROM interfaces really should default to IFF_NOARP.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:26:26 -07:00
Ralf Baechle dd8aa40431 [NETROM] NET/ROM has no txqueue
NET/ROM uses virtual interfaces so setting a queue length is wrong.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:25:57 -07:00
Ralf Baechle 4676356b57 [AX.25]: Reformat ax25_proto_ops initialization
Reformat iniitalization of ax25_proto_ops.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:25:25 -07:00
Ralf Baechle 20b7d10a33 [AX.25/ROSE]: Whitespace formatting changes
Small formatting changes.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:24:55 -07:00
Ralf Baechle 9b37ee7585 [NETROM/AX.25/ROSE]: Remove useless tests
Remove error tests that have already been performed by the caller.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:23:52 -07:00
Ralf Baechle 6ddcf626fd [NETROM]: statistics fix
Calling an incoming NET/ROM-encapsulated IP packet an error if the
interface isn't up is probably a bit over the top, so count it as
dropped instead of an error.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:23:06 -07:00
Ralf Baechle 3f2aadd041 [NETROM]: Fix rebuild header mess
For reason that probably nobody recalls NET/ROM does it's actual
packet transmission in nr_rebuild_header and even treats invocation of
it's hard_start_xmit method nr_xmit as a bug.  Fix that by splitting
the job done by nr_rebuild_header into two halves.  Along with that we
now also can get rid of the silly clone of the skb on transmit.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:21:48 -07:00
Ralf Baechle 6f74998e5c [AX.25]: Rename ax25_encapsulate to ax25_hard_header
Rename ax25_encapsulate to ax25_hard_header which these days more
accurately describes what the function is supposed to do.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:21:01 -07:00
Arnaldo Carvalho de Melo 59c2353dd0 [CCID3]: Listen socks doesn't have a private CCID block
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:16:58 -07:00
Nishanth Aravamudan 121caf577d [NET]: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.  Also use
human-time conversion functions instead of hard-coded division to avoid
rounding issues.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:15:34 -07:00
Herbert Xu e130af5dab [TCP]: Fix double adjustment of tp->{lost,left}_out in tcp_fragment().
There is an extra left_out/lost_out adjustment in tcp_fragment which
means that the lost_out accounting is always wrong.  This patch removes
that chunk of code.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-10 17:19:09 -07:00
Brian Haley e6df439b89 [IPV6]: Bring Type 0 routing header in-line with rfc3542.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-10 00:15:06 -07:00
David S. Miller 41c29dd15b Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-09-10 00:01:36 -07:00
Arnaldo Carvalho de Melo 59d203f9e9 [CCID3] Cleanup ccid3 debug calls
Also use some BUG_ON where appropriate and use LIMIT_NETDEBUG for the unlikely
cases where we, at this stage, want to know about, that in my tests hasn't
appeared in the radar.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 20:01:25 -03:00
Arnaldo Carvalho de Melo dc19336c76 [DCCP] Only call the HC _exit() routines in dccp_v4_destroy_sock
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 19:59:26 -03:00
Arnaldo Carvalho de Melo d7e0fb985c [CCID3] Initialize ccid3hctx_t_ipi to 250ms
To match more closely what is described in RFC 3448.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
2005-09-09 19:58:18 -03:00
Linus Torvalds 1d8674edb5 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-09 14:25:22 -07:00
Ingo Molnar a9f6a0dd54 [PATCH] more SPIN_LOCK_UNLOCKED -> DEFINE_SPINLOCK conversions
This converts the final 20 DEFINE_SPINLOCK holdouts.  (another 580 places
are already using DEFINE_SPINLOCK).  Build tested on x86.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Ingo Molnar 8d06afab73 [PATCH] timer initialization cleanup: DEFINE_TIMER
Clean up timer initialization by introducing DEFINE_TIMER a'la
DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
been in the -RT tree for some time.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Dipankar Sarma b835996f62 [PATCH] files: lock-free fd look-up
With the use of RCU in files structure, the look-up of files using fds can now
be lock-free.  The lookup is protected by rcu_read_lock()/rcu_read_unlock().
This patch changes the readers to use lock-free lookup.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:55 -07:00
Stephen Hemminger cb7b593c2c [IPV4] fib_trie: fix proc interface
Create one iterator for walking over FIB trie, and use it
for all the /proc functions. Add a /proc/net/route
output for backwards compatibility with old applications.

Make initialization of fib_trie same as fib_hash so no #ifdef
is needed in af_inet.c

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=5209

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-09 13:35:42 -07:00
David S. Miller 8259f16257 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-09-09 13:17:43 -07:00
Arnaldo Carvalho de Melo 59725dc2a2 [CCID3] Introduce ccid3_hc_[rt]x_sk() for overal consistency
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:40:58 -03:00
Arnaldo Carvalho de Melo b0e567806d [DCCP] Introduce dccp_timestamp
To start the timestamps with 0.0ms, easing the integer maths in the CCIDs, this
probably will be reworked to use the to be introduced struct timeval_offset
infrastructure out of skb_get_timestamp, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:38:35 -03:00
Arnaldo Carvalho de Melo 954ee31f36 [CCID3] Initialize more fields in ccid3_hc_rx_init
The initialization of ccid3hcrx_rtt to 5ms is just a bandaid, I'll continue
auditing the CCID3 HC rx codebase to fix this properly, probably I'll add a
feedback timer as suggested in the CCID3 draft.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:37:05 -03:00
Arnaldo Carvalho de Melo b3a3077d96 [CCID3] Make the ccid3hcrx_rtt calc look more like the ccid3hctx_rtt one
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:34:10 -03:00
Arnaldo Carvalho de Melo 1a28599a2c [CCID3] Use ELAPSED_TIME in the HC TX RTT estimation
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:32:56 -03:00
Arnaldo Carvalho de Melo 1c14ac0ae8 [DCCP] Give precedence to the biggest ELAPSED_TIME
We can get this value in an TIMESTAMP_ECHO and/or in an ELAPSED_TIME option, if
receiving both give precendence to the biggest one.

In my tests they are very close if not equal at all times, so we may well think
about removing the code in CCID3 that inserts this option and leaving this to
the core, and perhaps even use just TIMESTAMP_ECHO including the elapsed time.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:32:01 -03:00
Arnaldo Carvalho de Melo 27ae543e6f [CCID3] Calculate ccid3hcrx_x_recv using usecs_div
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:31:07 -03:00
Arnaldo Carvalho de Melo 507d37cf26 [CCID] Only call the HC insert_options methods when requested
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:30:07 -03:00
Arnaldo Carvalho de Melo 0ba7a3ba66 [CCID3] Avoid unsigned integer overflows in usecs_div
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-09 02:28:47 -03:00
Linus Torvalds 27e2df2228 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-08 15:52:11 -07:00
Patrick McHardy e104411b82 [XFRM]: Always release dst_entry on error in xfrm_lookup
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 15:11:55 -07:00
Herbert Xu cf0b450cd5 [TCP]: Fix off by one in tcp_fragment() "already sent" test.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 15:10:52 -07:00
Patrick McHardy a57ebc90f1 [IPV6]: Don't redo xfrm_lookup for cached dst entries
The xfrm lookup is already done when the dst entry is looked up first and
stored in the cache.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 14:27:47 -07:00
Linus Torvalds 1ee9bed173 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-08 13:52:24 -07:00
Jeff Garzik 45ac56ca64 Kconfig: IEEE80211 should not depend on NET_RADIO
We should not restrict use of ieee80211 to only when wireless drivers
are enabled.  In-development and out-of-tree drivers may wish to use it,
and by removing this restriction we eliminate a circular dependency.
2005-09-08 16:44:33 -04:00
Ralf Baechle baed16a7ff [AX.25]: Make asc2ax() thread-proof
Asc2ax was still using a static buffer for all invocations which isn't
exactly SMP-safe.  Change asc2ax to take an additional result buffer as
the argument.  Change all callers to provide such a buffer.

This one only really is a fix for ROSE and as per recent discussions
there's still much more to fix in ROSE ...

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 13:40:41 -07:00
Andrew Morton 3a93481589 [NETFILTER]: ip_conntrack_netbios_ns.c gcc-2.95.x build fix
gcc-2.95.x can't do this sort of initialisation

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 13:36:34 -07:00
Julian Anastasov ce723d8e04 [IPV4]: Fix refcount damaging in net/ipv4/route.c
One such place that can damage the dst refcnts is route.c with
CONFIG_IP_ROUTE_MULTIPATH_CACHED enabled, i don't see the user's
.config. In this new code i see that rt_intern_hash is called before
dst->refcnt is set to 1, dst is the 2nd arg to rt_intern_hash.

Arg 2 of rt_intern_hash must come with refcnt 1 as it is added to
table or dropped depending on error/add/update. One such example is
ip_mkroute_input where __mkroute_input return rth with refcnt 0 which
is provided to rt_intern_hash. ip_mkroute_output looks like a 2nd such
place. Appending untested patch for comments and review.  The idea is
to put previous reference as we are going to return next result/error.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 13:34:47 -07:00
David S. Miller 2e66fc4116 Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6-git-rfc3542 2005-09-08 12:59:43 -07:00
Stephen Hemminger 42ca89c18b [IPV6]: Need to use pskb_trim_rcsum().
Fix pskb_trim usage in ipv6. Only the udp one is really
a bug, other places are just doing equivalent code.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 12:57:43 -07:00
Stephen Hemminger e308e25c97 [IPV4] udp: trim forgets about CHECKSUM_HW
A UDP packet may contain extra data that needs to be trimmed off.
But when doing so, UDP forgets to fixup the skb checksum if CHECKSUM_HW
is being used.

I think this explains the case of a NFS receive using skge driver
causing 'udp hw checksum failures' when interacting with a crufty
settop box.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 12:32:21 -07:00
Al Viro 8920e8f94c [PATCH] Fix 32bit sendmsg() flaw
When we copy 32bit ->msg_control contents to kernel, we walk the same
userland data twice without sanity checks on the second pass.

Second version of this patch: the original broke with 64-bit arches
running 32-bit-compat-mode executables doing sendmsg() syscalls with
unaligned CMSG data areas

Another thing is that we use kmalloc() to allocate and sock_kfree_s()
to free afterwards; less serious, but also needs fixing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-08 08:14:11 -07:00
YOSHIFUJI Hideaki 41a1f8ea4f [IPV6]: Support IPV6_{RECV,}TCLASS socket options / ancillary data.
Based on patch from David L Stevens <dlstevens@us.ibm.com>

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-09-08 10:19:03 +09:00
YOSHIFUJI Hideaki 333fad5364 [IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542).
Support several new socket options / ancillary data:
  IPV6_RECVPKTINFO, IPV6_PKTINFO,
  IPV6_RECVHOPOPTS, IPV6_HOPOPTS,
  IPV6_RECVDSTOPTS, IPV6_DSTOPTS, IPV6_RTHDRDSTOPTS,
  IPV6_RECVRTHDR, IPV6_RTHDR,
  IPV6_RECVHOPOPTS, IPV6_HOPOPTS

Old semantics are preserved as IPV6_2292xxxx so that
we can maintain backward compatibility.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-09-08 09:59:17 +09:00
Linus Torvalds 55faed1e60 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-07 17:22:43 -07:00
Linus Torvalds f7402dc44d Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-07 17:20:11 -07:00
Max Kellermann 49e31cbac5 [PATCH] sunrpc: print unsigned integers in stats
The sunrpc stats are collected in unsigned integers, but they are printed
with '%d'.  That can result in negative numbers in /proc/net/rpc when the
highest bit of a counter is set.  The following patch changes '%d' to '%u'
where appropriate.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:39 -07:00
Bruce Allan f35279d3f7 [PATCH] sunrpc: cache_register can use wrong module reference
When registering an RPC cache, cache_register() always sets the owner as the
sunrpc module.  However, there are RPC caches owned by other modules.  With
the incorrect owner setting, the real owning module can be removed potentially
with an open reference to the cache from userspace.

For example, if one were to stop the nfs server and unmount the nfsd
filesystem, the nfsd module could be removed eventhough rpc.idmapd had
references to the idtoname and nametoid caches (i.e.
/proc/net/rpc/nfs4.<cachename>/channel is still open).  This resulted in a
system panic on one of our machines when attempting to restart the nfs
services after reloading the nfsd module.

The following patch adds a 'struct module *owner' field in struct
cache_detail.  The owner is further assigned to the struct proc_dir_entry
in cache_register() so that the module cannot be unloaded while user-space
daemons have an open reference on the associated file under /proc.

Signed-off-by: Bruce Allan <bwa@us.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:25 -07:00
Jeff Garzik 0edd5b4491 [wireless ieee80211,ipw2200] Lindent source code
No code changes, just Lindent + manual fixups.

This prepares us for updating to the latest Intel driver code, plus
gives the source code a nice facelift.
2005-09-07 00:48:31 -04:00
Jeff Garzik bbeec90b98 [wireless] build fixes after merging WE-19 2005-09-07 00:27:54 -04:00
Max Kellermann 832079d29a [SUNRPC]: print unsigned integers in stats
From: Max Kellermann <max@duempel.org>

The sunrpc stats are collected in unsigned integers, but they are printed
with '%d'.  That can result in negative numbers in /proc/net/rpc when the
highest bit of a counter is set.  The following patch changes '%d' to '%u'
where appropriate.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 20:04:59 -07:00
Patrick McHardy 0a3f4358ac [NET]: proto_unregister: fix sleeping while atomic
proto_unregister holds a lock while calling kmem_cache_destroy, which
can sleep.

Noticed by Daniele Orlandi <daniele@orlandi.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 19:47:50 -07:00
Jean Tourrilhes 6582c164f2 [PATCH] WE-19 for kernel 2.6.13
Hi Jeff,

	This is version 19 of the Wireless Extensions. It was supposed
to be the fallback of the WPA API changes, but people seem quite happy
about it (especially Jouni), so the patch is rather small.
	The patch has been fully tested with 2.6.13 and various
wireless drivers, and is in its final version. Would you mind pushing
that into Linus's kernel so that the driver and the apps can take
advantage ot it ?

	It includes :
	o iwstat improvement (explicit dBm). This is the result of
long discussions with Dan Williams, the authors of
NetworkManager. Thanks to him for all the fruitful feedback.
	o remove pointer from event stream. I was not totally sure if
this pointer was 32-64 bits clean, so I'd rather remove it and be at
peace with it.
	o remove linux header from wireless.h. This has long been
requested by people writting user space apps, now it's done, and it
was not even painful.
	o final deprecation of spy_offset. You did not like it, it's
now gone for good.
	o Start deprecating dev->get_wireless_stats -> debloat netdev
	o Add "check" version of event macros for ieee802.11
stack. Jiri Benc doesn't like the current macros, we aim to please ;-)
	All those changes, except the last one, have been bit-roting on
my web pages for a while...

	Patches for most kernel drivers will follow. Patches for the
Orinoco and the HostAP drivers have been sent to their respective
maintainers.

	Have fun...

	Jean
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-06 22:40:24 -04:00
Stephen Hemminger 48bc41a49c [IPV4]: Reassembly trim not clearing CHECKSUM_HW
This was found by inspection while looking for checksum problems
with the skge driver that sets CHECKSUM_HW. It did not fix the
problem, but it looks like it is needed.

If IP reassembly is trimming an overlapping fragment, it
should reset (or adjust) the hardware checksum flag on the skb.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:51:48 -07:00
Ralf Baechle f75268cd6c [AX25]: Make ax2asc thread-proof
Ax2asc was still using a static buffer for all invocations which isn't
exactly SMP-safe.  Change ax2asc to take an additional result buffer as
the argument.  Change all callers to provide such a buffer.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:49:39 -07:00
Patrick McHardy 513c250000 [NETLINK]: Don't prevent creating sockets when no kernel socket is registered
This broke the pam audit module which includes an incorrect check for
-ENOENT instead of -EPROTONOTSUPP.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:43:59 -07:00
Patrick McHardy e446639939 [NETFILTER]: Missing unlock in TCP connection tracking error path
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:11:10 -07:00
Pablo Neira Ayuso 49719eb355 [NETFILTER]: kill __ip_ct_expect_unlink_destroy
The following patch kills __ip_ct_expect_unlink_destroy and export
unlink_expect as ip_ct_unlink_expect. As it was discussed [1], the function
__ip_ct_expect_unlink_destroy is a bit confusing so better do the following
sequence: ip_ct_destroy_expect and ip_conntrack_expect_put.

[1] https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020794.html

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:10:46 -07:00
Pablo Neira Ayuso 91c46e2e60 [NETFILTER]: Don't increase master refcount on expectations
As it's been discussed [1][2]. We shouldn't increase the master conntrack
refcount for non-fulfilled conntracks. During the conntrack destruction,
the expectations are always killed before the conntrack itself, this
guarantees that there won't be any orphan expectation.

[1]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020783.html
[2]https://lists.netfilter.org/pipermail/netfilter-devel/2005-August/020904.html

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:10:23 -07:00
Patrick McHardy e7dfb09a36 [NETFILTER]: Fix HW checksum handling in nfnetlink_queue
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:10:00 -07:00
Patrick McHardy 03486a4f83 [NETFILTER]: Handle NAT module load race
When the NAT module is loaded when connections are already confirmed
it must not change their tuples anymore. This is especially important
with CONFIG_NETFILTER_DEBUG, the netfilter listhelp functions will
refuse to remove an entry from a list when it can not be found on
the list, so when a changed tuple hashes to a new bucket the entry
is kept in the list until and after the conntrack is freed.

Allocate the exact conntrack tuple for NAT for already confirmed
connections or drop them if that fails.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:09:43 -07:00
Yasuyuki Kozakai 31c913e7fd [NETFILTER]: Fix CONNMARK Kconfig dependency
Connection mark tracking support is one of the feature in connection
tracking, so IP_NF_CONNTRACK_MARK depends on IP_NF_CONNTRACK.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:09:20 -07:00
Patrick McHardy a2978aea39 [NETFILTER]: Add NetBIOS name service helper
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:08:51 -07:00
Patrick McHardy 2248bcfcd8 [NETFILTER]: Add support for permanent expectations
A permanent expectation exists until timeing out and can expect
multiple related connections.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 15:06:42 -07:00
Eric Dumazet 9261c9b042 [NET]: Make sure l_linger is unsigned to avoid negative timeouts
One of my x86_64 (linux 2.6.13) server log is filled with :

schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca

This is because some application does a

struct linger li;
li.l_onoff = 1;
li.l_linger = -1;
setsockopt(sock, SOL_SOCKET, SO_LINGER, &li, sizeof(li));

And unfortunatly l_linger is defined as a 'signed int' in
include/linux/socket.h:

struct linger {
         int             l_onoff;        /* Linger active                */
         int             l_linger;       /* How long to linger for       */
};

I dont know if it's safe to change l_linger to 'unsigned int' in the
include file (It might be defined as int in ABI specs)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 14:51:39 -07:00
Eric Dumazet b69aee04fb [NET]: Use file->private_data to get socket pointer.
Avoid touching file->f_dentry on sockets, since file->private_data
directly gives us the socket pointer.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 14:42:45 -07:00
Linus Torvalds 5bcaa15579 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-06 00:47:18 -07:00
David S. Miller 4c2cac8908 [IEEE80211]: Use correct size_t printf format string in ieee80211_rx.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 23:19:49 -07:00
Herbert Xu fb5f5e6e0c [TCP]: Fix TCP_OFF() bug check introduced by previous change.
The TCP_OFF assignment at the bottom of that if block can indeed set
TCP_OFF without setting TCP_PAGE.  Since there is not much to be
gained from avoiding this situation, we might as well just zap the
offset.  The following patch should fix it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:55:48 -07:00
Herbert Xu 1198ad002a [NET]: 2.6.13 breaks libpcap (and tcpdump)
Patrick McHardy says:

  Never mind, I got it, we never fall through to the second switch
  statement anymore. I think we could simply break when load_pointer
  returns NULL. The switch statement will fall through to the default
  case and return 0 for all cases but 0 > k >= SKF_AD_OFF.

Here's a patch to do just that.

I left BPF_MSH alone because it's really a hack to calculate the IP
header length, which makes no sense when applied to the special data.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:44:37 -07:00
David S. Miller 6baf1f417d [NET]: Do not protect sysctl_optmem_max with CONFIG_SYSCTL
The ipv4 and ipv6 protocols need to access it unconditionally.
SYSCTL=n build failure reported by Russell King.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:14:11 -07:00
Harald Welte aa07ca5793 [NETFILTER] remove bogus hand-coded htonll() from nenetlink_queue
htonll() is nothing else than cpu_to_be64(), so we'd rather call the
latter.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:09:08 -07:00
Adrian Bunk 506e7beb74 [IRDA]: IrDA prototype fixes
Every file should #include the header files containing the prototypes
of it's global functions.

In this case this showed that the prototype of irlan_print_filter()
was wrong which is also corrected in this patch.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:08:11 -07:00
Adrian Bunk 8c5955d83e [SCTP]: net/sctp/sysctl.c should #include <net/sctp/sctp.h>
Every file should #include the header files containing the prototypes of
it's global functions.

sctp.h contains the prototypes of sctp_sysctl_{,un}register().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:07:42 -07:00
Adrian Bunk 395dde20fb [NETFILTER]: net/netfilter/nfnetlink*: make functions static
This patch makes needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:06:45 -07:00
Adrian Bunk 43d60661ac [IPV4]: net/ipv4/ipconfig.c should #include <linux/nfs_fs.h>
Every file should #include the header files containing the prototypes of 
it's global functions.

nfs_fs.h contains the prototype of root_nfs_parse_addr().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:05:52 -07:00
Adrian Bunk 295098e9f4 [ATM]: net/atm/ioctl.c should #include "common.h"
Every file should #include the header files containing the prototypes
of it's global functions.

common.h contains the prototype for vcc_ioctl().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:04:28 -07:00