dlm_recvd small picture
Description
dlm_recvd is always reported in D state when the cluster freezes, so it is worth understanding what it does.
dlm_recvd is defined in lowcomms.c, so here is the description of what lowcomms.c does (taken from the comments in the code).
lowcomms.c
This is the "low-level" comms layer.
It is responsible for sending/receiving messages from other nodes in the cluster.
Cluster nodes are referred to by their nodeids. nodeids are simply 32 bit numbers to the locking module - if they need to be expanded for the cluster infrastructure then that is its responsibility. It is this layer's responsibility to resolve these into IP addresses or whatever it needs for inter-node communication.
The comms level is two kernel threads that deal mainly with the receiving of messages from other nodes and passing them up to the mid-level comms layer (which understands the message format) for execution by the locking core, and a send thread which does all the setting up of connections to remote nodes and the sending of data. Threads are not allowed to send their own data because it may cause them to wait in times of high load. Also, this way, the sending thread can collect together messages bound for one node and send them in one block.
I don't see any problem with the recv thread executing the locking code on behalf of remote processes as the locking code is short, efficient and never waits.
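The nodeid resolution mentioned in the comment above can be pictured as a plain table lookup. This is only a minimal user-space sketch, not the kernel implementation: the table demo_nodes, its addresses and the function demo_nodeid_to_addr are hypothetical names invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: a 32 bit nodeid and the address used to reach it. */
struct demo_node {
	uint32_t nodeid;
	const char *addr;
};

/* Made-up example table (documentation addresses); a real cluster
 * would take this information from its configuration. */
static const struct demo_node demo_nodes[] = {
	{ 1, "192.0.2.1" },
	{ 2, "192.0.2.2" },
	{ 3, "192.0.2.3" },
};

/* Resolve a nodeid to its address string, or NULL if the node is unknown. */
static const char *demo_nodeid_to_addr(uint32_t nodeid)
{
	size_t i;

	for (i = 0; i < sizeof(demo_nodes) / sizeof(demo_nodes[0]); i++) {
		if (demo_nodes[i].nodeid == nodeid)
			return demo_nodes[i].addr;
	}
	return NULL;
}
```

The locking module above this layer would only ever pass the number; everything address related stays inside the lookup.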
Architecture
dlm/lowcomms.c(dlm_recvd(void* data))
|
-> init_waitq and set the task state so that it is ready to wait
|
-> while (!kthread_should_stop())
   |
   -> if the work list is empty: schedule(), so this kthread sleeps until it is woken again
   -> "else" process_sockets()
dlm/lowcomms.c(process_sockets)
|
-> spin_lock_bh(&read_sockets_lock)
|
-> list_for_each_safe(list, temp, &read_sockets) // kernel macro: walks the list whose head is &read_sockets; list is the cursor and temp holds the next element, so the current entry can be deleted safely
   |
   -> struct connection *con = list_entry..
   |
   -> list_del(&con->read_list) // unlink the connection from the read_sockets list
   |
   -> spin_unlock_bh(&read_sockets_lock);
   |
   -> if the request in the connection has not completely arrived: spin_lock_bh(&read_sockets_lock), then continue with the next connection
   |
   -> do
      |
      -> con->rx_action(con) // one of the rx_action functions described in the next section
      |
      -> schedule() if too many iterations (MAX_RX_MSG_COUNT)
      |
      -> while (!atomic_dec_and_test(&con->waiting_requests) && !kthread_should_stop()) // repeat as long as the decrement does not bring con->waiting_requests to zero and the kthread has not been asked to stop
   |
   -> spin_lock_bh(&read_sockets_lock) // re-take the lock before moving on to the next list element
|
-> spin_unlock_bh(&read_sockets_lock)
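The inner do/while loop of process_sockets is the easiest part to misread, so here is a small user-space model of it. It is a sketch under assumptions, not the kernel code: demo_connection, demo_dec_and_test, demo_rx and demo_drain are hypothetical names, the value of DEMO_MAX_RX_MSG_COUNT is assumed, C11 atomics stand in for the kernel's atomic_t, and a yield counter stands in for the call to schedule().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_MAX_RX_MSG_COUNT 25 /* assumed value, for illustration only */

struct demo_connection {
	atomic_int waiting_requests;                 /* requests still to be handled */
	int (*rx_action)(struct demo_connection *);  /* per-connection callback */
	int handled;                                 /* how often rx_action ran */
};

/* User-space stand-in for the kernel's atomic_dec_and_test():
 * decrement the counter and report whether the new value is zero. */
static bool demo_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1 == 0;
}

/* Trivial rx_action: only counts the call. */
static int demo_rx(struct demo_connection *con)
{
	con->handled++;
	return 0;
}

/* Drain one connection with the same loop shape as in the tree above.
 * Returns how many times the loop would have yielded the CPU. */
static int demo_drain(struct demo_connection *con, const bool *should_stop)
{
	int count = 0, yields = 0;

	/* Assumption: a connection without pending requests is skipped. */
	if (atomic_load(&con->waiting_requests) == 0)
		return 0;

	do {
		con->rx_action(con);
		if (++count >= DEMO_MAX_RX_MSG_COUNT) {
			yields++; /* the kernel thread would schedule() here */
			count = 0;
		}
	} while (!demo_dec_and_test(&con->waiting_requests) && !*should_stop);

	return yields;
}
```

With three pending requests and no stop request, rx_action runs three times and the counter ends at zero. With a stop request already set, the loop body still runs once, because the condition is only checked after the first pass.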
int rx_action(struct connection)
This is the type of the action that is executed when the connection is active (see above). Possible rx_action functions are:
* lowcomms.c/receive_from_sock:
Seems to read the data and convert it into more semantic structures.
* lowcomms.c/accept_from_sock:
Not examined yet. Judging by its name, it accepts a new incoming connection on a listening socket.
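The rx_action member is an ordinary C function pointer, so which function runs depends on how the connection was set up. The sketch below illustrates that mechanism in user space. All names are hypothetical (demo_connection, demo_setup, the two demo handlers), and the rule "listening connections get the accept handler, all others the receive handler" is an assumption made for the example, not something verified against lowcomms.c here.

```c
#include <stdbool.h>

struct demo_connection {
	bool is_listener;                            /* hypothetical flag */
	int (*rx_action)(struct demo_connection *);  /* called when the socket is readable */
	int accepted;                                /* calls to the accept handler */
	int received;                                /* calls to the receive handler */
};

/* Stand-in for accept_from_sock: a listening socket has a new peer. */
static int demo_accept_from_sock(struct demo_connection *con)
{
	con->accepted++;
	return 0;
}

/* Stand-in for receive_from_sock: a connected socket has data to read. */
static int demo_receive_from_sock(struct demo_connection *con)
{
	con->received++;
	return 0;
}

/* Pick the handler once, at setup time; callers later only use con->rx_action. */
static void demo_setup(struct demo_connection *con, bool is_listener)
{
	con->is_listener = is_listener;
	con->accepted = 0;
	con->received = 0;
	con->rx_action = is_listener ? demo_accept_from_sock
				     : demo_receive_from_sock;
}
```

The processing loop never needs to know which kind of connection it holds; it calls con->rx_action(con) and the pointer decides.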
Questions
If dlm_recvd deadlocks, is the cause the spinlock read_sockets_lock or the rx_action function?
The spinlock is also used by lowcomms_data_ready.
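To reason about the question, it helps to model why process_sockets drops the lock before calling rx_action. The following single-threaded sketch is only a thought aid and proves nothing about the real kernel: the lock is modelled as a flag that asserts when taken twice, and every name (demo_lock, demo_con, demo_data_ready, demo_process, demo_rx) is hypothetical. demo_data_ready plays the role of lowcomms_data_ready under the assumption that it counts the new request and queues the connection under the same lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of a non-recursive lock: taking it twice is a bug. */
static bool demo_lock_held;

static void demo_lock(void)   { assert(!demo_lock_held); demo_lock_held = true; }
static void demo_unlock(void) { assert(demo_lock_held);  demo_lock_held = false; }

struct demo_con {
	struct demo_con *next;   /* link in the ready list */
	bool queued;             /* already on the ready list? */
	int waiting_requests;
	int handled;
};

static struct demo_con *demo_ready; /* ready list head, guarded by demo_lock */

/* Producer side: note the new request and queue the connection once. */
static void demo_data_ready(struct demo_con *con)
{
	con->waiting_requests++;
	if (con->queued)
		return;
	demo_lock();
	con->queued = true;
	con->next = demo_ready;
	demo_ready = con;
	demo_unlock();
}

/* Consumer side: same lock/unlock pattern as in the process_sockets tree. */
static void demo_process(void (*rx_action)(struct demo_con *))
{
	demo_lock();
	while (demo_ready) {
		struct demo_con *con = demo_ready;

		demo_ready = con->next;
		con->queued = false;
		demo_unlock();          /* lock dropped before the callback runs */

		while (con->waiting_requests > 0) {
			rx_action(con);
			con->waiting_requests--;
		}

		demo_lock();            /* re-taken before touching the list again */
	}
	demo_unlock();
}

/* rx_action during which new data arrives once, as it could on a busy socket. */
static void demo_rx(struct demo_con *con)
{
	con->handled++;
	if (con->handled == 1)
		demo_data_ready(con);
}
```

In this model the callback can trigger demo_data_ready without tripping the assertion, because the lock is not held while rx_action runs. If demo_process kept the lock across the callback, the assertion in demo_lock would fire, which is the model's equivalent of a self-deadlock.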