The exact details depend on the operating system, but the following illustrates the technique most implementations use. The user process opens a device or issues a system call that gives it a descriptor from which it can read packets off the wire. The kernel then passes the packets straight to the process.
On its own, though, this wouldn't work too well on a busy network or a slow machine, because the user process would have to read the packets as fast as they appear on the network. That's where buffering and packet filtering come in.
The kernel buffers up to a fixed number of bytes of packet data and hands the packets to the process one at a time as it asks for them. Because resources are finite, once the buffer is full any further packets are dropped rather than placed in it.
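The buffering behaviour described above can be sketched as a toy model. This is not any real kernel's code; the class name CaptureBuffer, its methods, and the byte capacity are assumptions made up for illustration. It shows only the rule that matters here: a packet that does not fit is dropped and counted, and the reader gets packets one at a time.

```python
from collections import deque


class CaptureBuffer:
    """Toy model of a capture buffer with a fixed byte budget (hypothetical)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        # Called as each packet arrives off the wire.
        # If the packet would push us past the limit, drop it.
        if self.used_bytes + len(packet) > self.capacity_bytes:
            self.dropped += 1
            return False
        self.queue.append(packet)
        self.used_bytes += len(packet)
        return True

    def read(self):
        # Called when the user process asks for the next packet.
        if not self.queue:
            return None
        packet = self.queue.popleft()
        self.used_bytes -= len(packet)
        return packet
```

Reading a packet frees its space, so a process that keeps up with the traffic never sees a drop; one that falls behind loses whatever arrives while the buffer is full.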
Packet filters allow a process to dictate which packets it's interested in. The usual way is to give the kernel a small program made up of opcodes, which it runs against each packet: the program reads values out of the packet and decides whether or not the packet is wanted. Each opcode performs only a very simple operation, but they can be combined to construct powerful filters.
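To make the opcode idea concrete, here is a minimal sketch of such an interpreter. The opcode names and their encoding are invented for this example and are not the real BPF instruction set; only the general shape (load a value from the packet, compare, accept or reject) follows the description above. The sample program assumes an Ethernet frame, where the type field sits at offset 12 and, for IPv4 without VLAN tags, the IP protocol byte sits at offset 23.

```python
# Hypothetical opcodes for illustration; NOT the real BPF instruction set.
LOAD_BYTE = "ldb"    # acc = packet[offset]
LOAD_HALF = "ldh"    # acc = 16-bit big-endian value at packet[offset]
JUMP_IF_EQ = "jeq"   # arg = (value, skip_if_equal, skip_if_not_equal)
RETURN = "ret"       # arg = True to accept the packet, False to reject


def run_filter(program, packet):
    """Run a filter program against one packet; return True if wanted."""
    acc = 0
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == LOAD_BYTE:
            if arg >= len(packet):
                return False  # reading past the end rejects the packet
            acc = packet[arg]
        elif op == LOAD_HALF:
            if arg + 2 > len(packet):
                return False
            acc = int.from_bytes(packet[arg:arg + 2], "big")
        elif op == JUMP_IF_EQ:
            value, if_equal, if_not_equal = arg
            pc += if_equal if acc == value else if_not_equal
        elif op == RETURN:
            return arg
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return False  # falling off the end rejects


# Accept only TCP carried in IPv4 over Ethernet.
TCP_OVER_IPV4 = [
    (LOAD_HALF, 12),               # Ethernet type field
    (JUMP_IF_EQ, (0x0800, 0, 3)),  # not IPv4? skip ahead to the reject
    (LOAD_BYTE, 23),               # IPv4 protocol field
    (JUMP_IF_EQ, (6, 0, 1)),       # 6 = TCP
    (RETURN, True),
    (RETURN, False),
]
```

Four trivial instructions are enough to express "IPv4 and TCP"; longer programs built the same way can match on addresses, ports, and so on.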
BPF filters first and then buffers. This is optimal, since the buffer only ever contains packets that are interesting to the process. The hope is that the filter cuts down the number of packets buffered enough to keep the buffer from overflowing, which would lead to packet loss.
NIT, unfortunately, does not do this; it applies the filter after buffering, when the user process starts to read from the buffered data. Unwanted packets therefore take up buffer space and can crowd out wanted ones.
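The difference between the two orderings can be shown with a small simulation. This is a toy model under stated assumptions, not BPF or NIT code: the function capture, its parameters, and the sample traffic are all made up for illustration, and the whole burst is assumed to arrive before the process gets a chance to read.

```python
def capture(packets, capacity_bytes, wanted, filter_first):
    """Simulate one burst of traffic (hypothetical model).

    Returns (delivered, lost_wanted): the wanted packets the process
    eventually reads, and how many wanted packets were lost to overflow.
    """
    buffered = []
    used = 0
    for pkt in packets:
        if filter_first and not wanted(pkt):
            continue  # rejected before it can take up buffer space
        if used + len(pkt) > capacity_bytes:
            continue  # buffer full: the packet is dropped
        buffered.append(pkt)
        used += len(pkt)
    # When filtering happens after buffering, unwanted packets are only
    # discarded here, at read time, after they have already used space.
    delivered = [pkt for pkt in buffered if wanted(pkt)]
    total_wanted = sum(1 for pkt in packets if wanted(pkt))
    return delivered, total_wanted - len(delivered)


def is_wanted(pkt):
    # Stand-in for a real filter: wanted packets start with "W".
    return pkt.startswith(b"W")


# Three uninteresting packets arrive just ahead of two interesting ones.
burst = [b"x" * 100] * 3 + [b"W" * 100] * 2

filter_then_buffer = capture(burst, 300, is_wanted, filter_first=True)
buffer_then_filter = capture(burst, 300, is_wanted, filter_first=False)
```

With a 300-byte buffer, filtering first keeps both wanted packets, while buffering first fills the buffer with the three unwanted packets and loses both wanted ones.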
According to route (route@infonexus.com), Linux's SOCK_PACKET does not do any buffering and has no kernel filtering.
Your mileage may vary with other packet capturing facilities.