samba-users May 2010 archive

[Samba] winbind ubuntu 9.10 crashing machine

From: Jim Kusznir <jkusznir_at_nospam>
Date: Mon May 10 2010 - 17:14:58 GMT
To: samba@lists.samba.org

Hi all:

I've got a couple of Ubuntu 9.10 machines that are suffering from a
recurring winbind failure that essentially crashes the machine. When
the system is in the "crashed state", one can ping the system, but all
forms of login fail. It will not even respond to tftpd requests; ssh
connections "time out", although the port itself still accepts the
initial connection. Rebooting does NOT recover from this. To recover,
I need to:

1) reboot into single user mode
2) edit /etc/nsswitch.conf and remove winbind
3) remove winbind from all pam.d/*
4) boot normally
5) stop samba and winbind
6) delete /var/lib/samba/* and /var/cache/samba/*
7) start samba
8) rejoin domain
9) start winbind
10) undo #2 and 3 above
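
Steps 5 through 9 above can be sketched as a dry-run plan. This is only
an illustration: the init script paths (/etc/init.d/samba and
/etc/init.d/winbind) and the JOIN_USER account name are assumptions, and
the script prints the commands rather than running them.

```python
# Dry-run plan for steps 5-9 of the recovery procedure.
# Assumptions (not from the original post): the init scripts live at
# /etc/init.d/samba and /etc/init.d/winbind, and JOIN_USER is a
# hypothetical account that is allowed to join machines to the domain.
JOIN_USER = "Administrator"


def recovery_plan(join_user=JOIN_USER):
    """Return the shell commands for steps 5-9, in order, without running them."""
    return [
        "/etc/init.d/winbind stop",                    # step 5
        "/etc/init.d/samba stop",                      # step 5
        "rm -rf /var/lib/samba/* /var/cache/samba/*",  # step 6
        "/etc/init.d/samba start",                     # step 7
        "net ads join -U " + join_user,                # step 8
        "/etc/init.d/winbind start",                   # step 9
    ]


if __name__ == "__main__":
    for command in recovery_plan():
        print(command)
```

Printing the plan first makes it easy to review the destructive step 6
before anything is run by hand.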

After this, winbind will work for a week or two. If I stop after step
4 above, the system is usable, but domain users cannot log in.
My diagnostics show that net ads users (and all other "samba"
commands) work just fine and find all users. All winbind-specific
commands (wbinfo -u, etc.) fail. Also, if I leave the system up in the
crashed state, the logs grow by about 32 GB in a few days. I have to
repeat the above procedure approximately once every 5 days
on our main production system. I have a second workstation that sees
very little use, and it has suffered the same crash, but far less
frequently. I have also tried inserting a step 6.5 where I delete the
machine account on the DC, but that doesn't change anything. Also,
our Ubuntu 9.04 system running the same configuration files has no
issues. We have not tried 10.04.

This problem has been plaguing our operations for over two months now,
so any assistance would be greatly appreciated.

Some log file snippets:

(from some point "in the middle" of the crash):
May 7 15:32:45 casas-lin winbindd[20677]: sys_select: pipe failed (Too many open files)
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/events.c:287(s3_event_debug)
May 7 15:32:45 casas-lin winbindd[20677]: s3_event: sys_select() failed: 24:Too many open files
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/select.c:64(sys_select)
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/debug.c:663(reopen_logs)
May 7 15:32:45 casas-lin winbindd[20677]: Unable to open new log file /var/log/samba/log.wb-CASAS: Too many open files
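
Since the messages above report errno 24 (too many open files), one way
to watch for the problem before the machine locks up is to count the
file descriptors each winbindd process holds. A minimal sketch, assuming
a Linux /proc filesystem and enough privilege to read /proc/<pid>/fd;
the helper names find_pids and count_open_fds are made up for this
example.

```python
import os


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir("/proc/%d/fd" % pid))


def find_pids(name):
    """Return the PIDs whose argv[0] basename equals name, sorted ascending."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open("/proc/%s/cmdline" % entry, "rb") as f:
                argv0 = f.read().split(b"\0")[0].decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if os.path.basename(argv0) == name:
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    for pid in find_pids("winbindd"):
        try:
            print(pid, count_open_fds(pid))
        except OSError as exc:
            print(pid, "unreadable:", exc)
```

Logging these counts every few minutes and comparing them with the
per-process open file limit would show whether descriptors leak steadily
or jump all at once.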
------
From startup (step 4 above):
May 10 08:36:50 casas-lin kernel: May 10 08:38:42 casas-lin winbindd[1571]: [2010/05/10 08:38:42, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:38:42 casas-lin winbindd[1571]: signing_good: BAD SIG: seq 41
May 10 08:42:25 casas-lin winbindd[1562]: [2010/05/10 08:42:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:42:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1571 is not responding. Closing connection to it.
May 10 08:42:25 casas-lin winbindd[1571]: [2010/05/10 08:42:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:42:25 casas-lin winbindd[1571]: Got sig[15] terminate (is_parent=0)
May 10 08:42:25 casas-lin winbindd[1825]: [2010/05/10 08:42:25, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:42:25 casas-lin winbindd[1825]: cli_pipe_verify_schannel: auth_len 56.
May 10 08:43:37 casas-lin winbindd[1825]: [2010/05/10 08:43:37, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:43:37 casas-lin winbindd[1825]: signing_good: BAD SIG: seq 23
May 10 08:47:25 casas-lin winbindd[1562]: [2010/05/10 08:47:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:47:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1825 is not responding. Closing connection to it.
May 10 08:47:25 casas-lin winbindd[1825]: [2010/05/10 08:47:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:47:25 casas-lin winbindd[1825]: Got sig[15] terminate (is_parent=0)
May 10 08:47:25 casas-lin winbindd[1832]: [2010/05/10 08:47:25, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:47:25 casas-lin winbindd[1832]: cli_pipe_verify_schannel: auth_len 56.
May 10 08:48:38 casas-lin winbindd[1832]: [2010/05/10 08:48:38, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:48:38 casas-lin winbindd[1832]: signing_good: BAD SIG: seq 23
May 10 08:52:25 casas-lin winbindd[1562]: [2010/05/10 08:52:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:52:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1832 is not responding. Closing connection to it.
May 10 08:52:25 casas-lin winbindd[1832]: [2010/05/10 08:52:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
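
The parent winbindd (pid 1562) reports a child timeout at regular
intervals in the excerpt above. A small sketch that measures the spacing
between those reports, assuming each log entry sits on a single line
(the mail client wrapped them); TIMEOUT_RE and timeout_intervals are
names invented here, and SAMPLE holds three entries copied from the
excerpt.

```python
import re
from datetime import datetime

# Matches the Samba debug header of an async_request_timeout_handler entry
# and captures its timestamp.
TIMEOUT_RE = re.compile(
    r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*\d+\]\s+"
    r"winbindd/winbindd_dual\.c:\d+\(async_request_timeout_handler\)"
)


def timeout_intervals(log_text):
    """Return the seconds between consecutive child-timeout entries."""
    stamps = [
        datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        for m in TIMEOUT_RE.finditer(log_text)
    ]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


SAMPLE = """\
May 10 08:42:25 casas-lin winbindd[1562]: [2010/05/10 08:42:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:47:25 casas-lin winbindd[1562]: [2010/05/10 08:47:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:52:25 casas-lin winbindd[1562]: [2010/05/10 08:52:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
"""

print(timeout_intervals(SAMPLE))  # prints [300, 300]
```

In this excerpt the child is replaced every 300 seconds, which suggests
a fixed timeout in the parent rather than random failures.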

---------
log.wb-CASAS (my domain is CASAS.WSU.EDU)
[2010/05/10 09:12:26, 1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
  ads_krb5_mk_req: krb5_get_credentials failed for ad1$@CASAS (KDC reply did not match expectations)
[2010/05/10 09:12:26, 1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
  cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: KDC reply did not match expectations
[2010/05/10 09:12:26, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
  cli_pipe_verify_schannel: auth_len 56.
[2010/05/10 09:12:26, 1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
  cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
-------
log.wb-CASAS.old (during "crashed state"):
[2010/04/19 08:17:23, 1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
  ads_krb5_mk_req: krb5_get_credentials failed for ad1$@CASAS (Cannot resolve network address for KDC in requested realm)
[2010/04/19 08:17:23, 1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
  cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: Cannot resolve network address for KDC in requested realm
[2010/04/19 08:17:23, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
  cli_pipe_verify_schannel: auth_len 56.
[2010/04/19 08:17:23, 1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
  cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
------------
My configuration
------------
smb.conf
------------
[global]
        security = ads
        netbios name = casas-lin
        realm = CASAS.WSU.EDU
        workgroup = CASAS
        password server = ad1.casas.wsu.edu
        workgroup = CASAS
        idmap uid = 10000-20000
        idmap gid = 10000-20000
        idmap backend = rid:CASAS.WSU.EDU=10000-20000
        winbind enum users = yes
        winbind enum groups = yes
        winbind use default domain = yes
        #template homedir = /home/%U
        template homedir = /net/files/home/%U
        template shell = /bin/bash
; client use spnego = yes
        domain master = no
--------------
/etc/krb5.conf
-------------
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = CASAS.WSU.EDU
 dns_lookup_realm = false
 dns_lookup_kdc = true
 ticket_lifetime = 24h
 forwardable = yes

[realms]
 EXAMPLE.COM = {
  kdc = kerberos.example.com:88
  admin_server = kerberos.example.com:749
  default_domain = example.com
 }

 CASAS.WSU.EDU = {
  kdc = ad1.casas.wsu.edu
  admin_server = ad1.casas.wsu.edu
  kdc = ad1.casas.wsu.edu
 }

 CASAS = {
  kdc = ad1.casas.wsu.edu
  admin_server = ad1.casas.wsu.edu
  kdc = ad1.casas.wsu.edu
 }

[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM

 casas.wsu.edu = CASAS.WSU.EDU
 .casas.wsu.edu = CASAS.WSU.EDU
[appdefaults]
 pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
 }
---------------
/etc/pam.d/common-account
---------------
account [success=1 new_authtok_reqd=done default=ignore] pam_unix.so
account requisite pam_deny.so
account required pam_permit.so
account sufficient pam_winbind.so
account required pam_krb5.so minimum_uid=1000
------------
/etc/pam.d/common-auth
------------
auth [success=3 default=ignore] pam_winbind.so krb5_auth krb5_ccache_type=FILE
auth [success=2 default=ignore] pam_krb5.so minimum_uid=1000 try_first_pass
auth [success=1 default=ignore] pam_unix.so nullok_secure try_first_pass
auth requisite pam_deny.so
auth required pam_permit.so
------------
/etc/pam.d/common-password
------------
password requisite pam_winbind.so
password requisite pam_krb5.so minimum_uid=1000 use_authtok
password [success=1 default=ignore] pam_unix.so obscure use_authtok try_first_pass sha512
password requisite pam_deny.so
password required pam_permit.so
password optional pam_gnome_keyring.so
-------------
/etc/nsswitch.conf
-------------
passwd: compat winbind
group: compat winbind
shadow: compat

hosts: files dns mdns4
networks: files

protocols: db files
services: db files
ethers: db files
rpc: db files

netgroup: nis
----------------
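
Steps 2 and 10 of the recovery procedure amount to toggling the winbind
source in the nsswitch.conf shown above. A sketch of the removal half,
working on the file's text so nothing is written to disk; strip_winbind
is a hypothetical helper, and it normalizes the spacing after each
colon.

```python
def strip_winbind(nsswitch_text):
    """Return nsswitch.conf text with 'winbind' removed from every database line."""
    out = []
    for line in nsswitch_text.splitlines():
        # Only touch "database: source ..." lines; leave comments and blanks alone.
        if ":" in line and not line.lstrip().startswith("#"):
            database, sources = line.split(":", 1)
            kept = [s for s in sources.split() if s != "winbind"]
            line = database + ": " + " ".join(kept)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "passwd: compat winbind\ngroup: compat winbind\nshadow: compat\n"
    print(strip_winbind(original), end="")
```

Keeping a copy of the original text around makes step 10 a simple
restore instead of a second hand edit.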

Thanks!
--Jim