Date:      Wed, 06 Jul 2011 12:23:39 +1000
From:      "Peter Ross" <Peter.Ross@bogen.in-berlin.de>
To:        "Jeremy Chadwick" <freebsd@jdc.parodius.com>
Cc:        freebsd-stable List <freebsd-stable@freebsd.org>, Scott Sipe <cscotts@gmail.com>
Subject:   Re: scp: Write Failed: Cannot allocate memory
Message-ID:  <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de>

Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:

> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
>> I'm running virtualbox 3.2.12_1 if that has anything to do with it.
>>
>> sysctl vfs.zfs.arc_max: 6200000000
>>
>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is hovering
>> right around that value, sometimes above, sometimes below (that's
>> as it should be, right?). I don't think that it dies when crossing
>> over arc_max. I can run the same scp 10 times and it might fail 1-3
>> times, with no correlation to the arcstats.size being above/below
>> arc_max that I can see.
>>
>> Scott
>>
>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
>>
>>> Hi all,
>>>
>>> just as an addition: an upgrade to last Friday's FreeBSD-Stable
>>> and to VirtualBox 4.0.8 does not fix the problem.
>>>
>>> I will experiment a bit more tomorrow after hours and grab some statistics.
>>>
>>> Regards
>>> Peter
>>>
>>> Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>>>
>>>> Hi all,
>>>>
>>>> I noticed a similar problem last week. It is also very similar to
>>>> one reported last year:
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
>>>>
>>>> My server is a Dell T410 server with the same bge card (the same
>>>> pciconf -lvc output as described by Mahlon:
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
>>>>
>>>> Yours, Scott, is a em(4)..
>>>>
>>>> Another similarity: In all cases we are using VirtualBox. I just
>>>> want to mention it, in case it matters. I am still running
>>>> VirtualBox 3.2.
>>>>
>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching
>>>> vfs.zfs.arc_max then, but I could catch one or two cases then the
>>>> value was still below.
>>>>
>>>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
>>>>
>>>> BTW: It looks as ARC only gives back the memory when I destroy
>>>> the ZFS (a cloned snapshot containing virtual machines). Even if
>>>> nothing happens for hours the buffer isn't released..
>>>>
>>>> My machine was still running 8.2-PRERELEASE so I am upgrading.
>>>>
>>>> I am happy to give information gathered on old/new kernel if it helps.
>>>>
>>>> Regards
>>>> Peter
>>>>
>>>> Quoting "Scott Sipe" <cscotts@gmail.com>:
>>>>
>>>>>
>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote:
>>>>>
>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>>>>>>>> I'm running 8.2-RELEASE and am having new problems with scp. When scping
>>>>>>>> files to a ZFS directory on the FreeBSD server -- most notably large files
>>>>>>>> -- the transfer frequently dies after just a few seconds. In my last test, I
>>>>>>>> tried to scp an 800mb file to the FreeBSD system and the transfer died after
>>>>>>>> 200mb. It completely copied the next 4 times I tried, and then died again on
>>>>>>>> the next attempt.
>>>>>>>>
>>>>>>>> On the client side:
>>>>>>>>
>>>>>>>> "Connection to home closed by remote host.
>>>>>>>> lost connection"
>>>>>>>>
>>>>>>>> In /var/log/auth.log:
>>>>>>>>
>>>>>>>> Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate
>>>>>>>> memory
>>>>>>>>
>>>>>>>> I've never seen this before and have used scp before to transfer large files
>>>>>>>> without problems. This computer has been used in production for months and
>>>>>>>> has a current uptime of 36 days. I have not been able to notice any problems
>>>>>>>> copying files to the server via samba or netatalk, or any problems in
>>>>>>>> apache.
>>>>>>>>
>>>>>>>> Uname:
>>>>>>>>
>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST
>>>>>>>> 2011     root@xeon:/usr/obj/usr/src/sys/GENERIC  amd64
>>>>>>>>
>>>>>>>> I've attached my dmesg and output of vmstat -z.
>>>>>>>>
>>>>>>>> I have not restarted the sshd daemon or rebooted the computer.
>>>>>>>>
>>>>>>>> Am glad to provide any other information or test anything else.
>>>>>>>>
>>>>>>>> {snip vmstat -z and dmesg}
>>>>>>>
>>>>>>> You didn't provide details about your networking setup (rc.conf,
>>>>>>> ifconfig -a, etc.).  netstat -m would be useful too.
>>>>>>>
>>>>>>> Next, please see this thread circa September 2010, titled "Network
>>>>>>> memory allocation failures":
>>>>>>>
>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>>>>>>>
>>>>>>> The user in that thread is using rsync, which relies on scp by default.
>>>>>>> I believe this problem is similar, if not identical, to yours.
>>>>>>>
>>>>>>
>>>>>> Please also provide your output of ( /usr/bin/limits -a ) for the server
>>>>>> end and the client.
>>>>>>
>>>>>> I am not quite sure I agree with the need for ifconfig -a but some
>>>>>> information about the networking driver your using for the interface
>>>>>> would be helpful, uptime of the boxes. And configuration of the pool.
>>>>>> e.g. ( zpool status -a ;zfs get all <poolname> ) You should probably
>>>>>> prop this information up somewhere so you can reference by URL whenever
>>>>>> needed.
>>>>>>
>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
>>>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
>>>>>> stating here but correct me if I am wrong. It does use ssh(1) by
>>>>>> default.
>>>>>>
>>>>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp
>>>>>> type filesystems that rsync(1) may be just filling up your temp ram area
>>>>>> and causing the connection abort which would be expected. ( df -h ) would
>>>>>> help here.
>>>>>
>>>>> Hello,
>>>>>
>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday were 3
>>>>> different OSX computers (over gigabit). The FreeBSD server has
>>>>> 12gb of ram and no bce adapter. For what it's worth, the server
>>>>> is backed up remotely every night with rsync (remote FreeBSD
>>>>> uses rsync to pull) to an offsite (slow cable connection)
>>>>> FreeBSD computer, and I have not seen any errors in the nightly
>>>>> rsync.
>>>>>
>>>>> Sorry for the omission of networking info, here's the output of
>>>>> the requested commands and some that popped up in the other
>>>>> thread:
>>>>>
>>>>> http://www.cap-press.com/misc/
>>>>>
>>>>> In rc.conf:  ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
>>>>>
>>>>> Scott
>
> Just to make it crystal clear to everyone:
>
> There is no correlation between this problem and use of ZFS.  People are
> attempting to correlate "cannot allocate memory" messages with "anything
> on the system that uses memory".  The VM is much more complex than that.
>
> Given the nature of this problem, it's much more likely the issue is
> "somewhere" within a networking layer within FreeBSD, whether it be
> driver-level or some sort of intermediary layer.
>
> Two people who have this issue in this thread are both using VirtualBox.
> Can one, or both, of you remove VirtualBox from the configuration
> entirely (kernel, etc. -- not sure what is required) and then see if the
> issue goes away?

On the machine in question I can only do that after hours, so I will do
it tonight.

I was _successfully_ sending the file over the loopback interface using

cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"

I did it, by the way, with the IPv6 localhost address first (accidentally),
and then using IPv4. Both worked.

It always fails if I send it through the bce(4) interface, even
if my target is the VirtualBox guest bridged to the bce card (so the
traffic does not physically "leave" the computer).
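
Since the failure is intermittent, a single run over each path says little. A minimal sketch of how the loopback and bce(4) runs could be compared by repeating the transfer and counting failures follows. The helper name count_failures is hypothetical, and the address in the commented example is simply the first bce0 address from the ifconfig output in this message; substitute whatever target applies.

```shell
#!/bin/sh
# Run a command N times and print how many runs exited non-zero.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Output is discarded; only the exit status is of interest.
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed"
}

# Example (assumed target address, taken from bce0 above):
# count_failures 10 sh -c \
#   'cat /zpool/temp/zimbra_oldroot.vdi | ssh 192.168.50.220 "cat > /dev/null"'
```

The exit status of the pipeline is that of ssh, which should be non-zero when the connection is dropped, so a lost transfer is counted as one failure.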

Below is the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output.

I have another box where I do not see this problem. It happily copies
files over the network using ssh.

It is an older HP ML 150 with only 3GB RAM, but with a bge(4) driver
instead. It runs the same RELENG_8 from last week. I installed VirtualBox
and enabled vboxnet (so it loads the kernel modules), but I do not run
VirtualBox on it (because it does not have enough RAM).
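
To compare the two boxes on the network-memory side, the netstat -m output requested earlier in the thread could be filtered for non-zero "denied" counters. This is only a sketch: the function name denied_nonzero is hypothetical, and it assumes the FreeBSD 8.x layout in which each such line starts with slash-separated counters, e.g. "0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)".

```shell
#!/bin/sh
# Read `netstat -m` output on stdin and print only those "denied" lines
# whose leading slash-separated counters are not all zero.
# Assumes the counters are the first whitespace-separated field.
denied_nonzero() {
    awk '/denied/ {
        n = split($1, c, "/")
        for (i = 1; i <= n; i++) {
            if (c[i] + 0 > 0) { print; break }
        }
    }'
}

# Usage on the FreeBSD host:
# netstat -m | denied_nonzero
```

No output means no denied requests were recorded since boot; any printed line names the pool that refused allocations.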

Regards
Peter

DellT410one# uname -a
FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun 30 17:07:18 EST 2011     root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC  amd64
DellT410one# ifconfig -a
bce0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
	ether 84:2b:2b:68:64:e4
	inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
	inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
bce1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
	ether 84:2b:2b:68:64:e5
	media: Ethernet autoselect
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
	inet6 ::1 prefixlen 128
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
vboxnet0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 0a:00:27:00:00:00
DellT410one# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.50.201     UGS         0    52195   bce0
127.0.0.1          link#11            UH          0        6    lo0
192.168.50.0/24    link#1             U           0  1118212   bce0
192.168.50.219     link#1             UHS         0     9670    lo0
192.168.50.220     link#1             UHS         0     8347    lo0
192.168.50.221     link#1             UHS         0   103024    lo0
192.168.50.223     link#1             UHS         0    43614    lo0
192.168.50.224     link#1             UHS         0     8358    lo0
192.168.50.225     link#1             UHS         0     8438    lo0
192.168.50.226     link#1             UHS         0     8338    lo0
192.168.50.227     link#1             UHS         0     8333    lo0
192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
192.168.168.0/24   192.168.50.200     UGS         0      552   bce0

Internet6:
Destination                       Gateway                       Flags        Netif Expire
::1                               ::1                           UH            lo0
fe80::%lo0/64                     link#11                       U             lo0
fe80::1%lo0                       link#11                       UHS           lo0
ff01::%lo0/32                     fe80::1%lo0                   U             lo0
ff02::%lo0/32                     fe80::1%lo0                   U             lo0
DellT410one# kldstat
Id Refs Address            Size     Name
  1   19 0xffffffff80100000 dbf5d0   kernel
  2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
  3    1 0xffffffff81012000 131998   zfs.ko
  4    1 0xffffffff81144000 1ff1     opensolaris.ko
  5    2 0xffffffff81146000 2940     vboxnetflt.ko
  6    2 0xffffffff81149000 8e38     netgraph.ko
  7    1 0xffffffff81152000 153c     ng_ether.ko
  8    1 0xffffffff81154000 e70      vboxnetadp.ko
DellT410one# pciconf -lv
..
bce0@pci0:1:0:0:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
     vendor     = 'Broadcom Corporation'
     class      = network
     subclass   = ethernet
bce1@pci0:1:0:1:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
     vendor     = 'Broadcom Corporation'
     class      = network
     subclass   = ethernet



