Date:      Fri, 24 Jul 2009 08:55:46 -0700 (PDT)
From:      Richard Mahlerwein <mahlerrd@yahoo.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: VMWare ESX and FBSD 7.2 AMD64 guest
Message-ID:  <571888.63015.qm@web51010.mail.re2.yahoo.com>

> From: John Nielsen <lists@jnielsen.net>
> Subject: Re: VMWare ESX and FBSD 7.2 AMD64 guest
> To: freebsd-questions@freebsd.org
> Cc: "Steve Bertrand" <steve@ibctech.ca>
> Date: Friday, July 24, 2009, 10:22 AM
>
> On Thursday 23 July 2009 19:44:15 Steve Bertrand wrote:
> > This message has a foot that has nearly touched down over the OT
> > borderline.
> >
> > We received an HP ProLiant DL360 G5 colocation box yesterday that
> > has two processors and 8 GB of memory.
> >
> > All the client wants to use this box for is a single instance of
> > Windows web hosting. Knowing the sites the client wants to
> > aggregate into IIS, I know the box is far more powerful than that
> > job needs.
> >
> > Making a long story short, they have agreed to let us put their
> > Windows server inside a virtualized container, so we can use the
> > unused horsepower for other VMs (test servers, etc.).
> >
> > My problem is performance. I'm only willing to make this box
> > virtual if I can keep the abstraction performance loss under 25%
> > (my ultimate goal would be 15%).
> >
> > The following is what I have, followed by my benchmark findings:
> >
> > # 7.2-RELEASE AMD64
> >
> > FreeBSD 7.2-RELEASE #0: Fri May  1 07:18:07 UTC 2009
> >     root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
> >
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Intel(R) Xeon(R) CPU 5150 @ 2.66GHz (2666.78-MHz K8-class CPU)
> >   Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
> >
> > usable memory = 8575160320 (8177 MB)
> > avail memory  = 8273620992 (7890 MB)
> >
> > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> >  cpu0 (BSP): APIC ID:  0
> >  cpu1 (AP):  APIC ID:  1
> >  cpu2 (AP):  APIC ID:  6
> >  cpu3 (AP):  APIC ID:  7
>
> Did you give the VM 4 virtual processors as well? How much RAM did it
> have? What type of storage does the server have? Did the VM just get
> a .vmdk on VMFS? What version of ESX?
>
> > Benchmarks:
> >
> > # time make -j4 buildworld (under vmware)
> > 5503.038u 3049.500s 1:15:46.25 188.1%  5877+1961k 3298+586716io 2407pf+0w
> >
> > # time make -j4 buildworld (native)
> > 4777.568u 992.422s 33:02.12 291.1%  6533+2099k 25722+586485io 3487pf+0w
>
> Note that the "user" time is within your 15% margin (if you round to
> the nearest percent). The system time is what's running away. My
> guess is that that is largely due to disk I/O and the virtualization
> of same. What you can do to address this depends on what hardware you
> have. Giving the VM a raw slice/LUN/disk instead of a .vmdk file may
> improve matters somewhat. If you do use a disk file, be sure it lives
> on a stripe boundary (or whatever unit is relevant) of the underlying
> storage; ways to do that (if any) depend on the storage. Improving
> the RAID performance of the storage will improve your benchmark
> overall, and may or may not narrow the divide.
>
> The (virtual) storage driver (mpt, IIRC) might have some parameters
> you could tweak, but I don't know about that off the top of my head.
>
> > ...both builds were from the exact same sources, and both runs used
> > the exact same environment. I was extremely careful to ensure the
> > environments were identical.
> >
> > I'd appreciate any feedback on tweaks I can make (either to VMware
> > or FreeBSD itself) to make the virtualized environment much more
> > efficient.
>
> See above about storage. Similar questions come up periodically;
> searching the archives, if you haven't already, may prove fruitful.
> You may also want to try running with different kernel HZ settings,
> for instance.
>
> I would also try to isolate the performance of the different
> components and evaluate their importance for your actual intended
> load. CPU and RAM probably perform as you'd expect out of the box.
> Disk and network I/O won't be as close to native speed, but the
> difference and its impact vary with your hardware and load.
>
> A lightly loaded Windows server is the poster child of virtualization
> candidates. If the decision is between dedicating the box to Winders
> and virtualizing it to use the excess capacity for something else, I
> would say it's a no-brainer if the cost of ESX isn't a factor (or if
> ESXi gives you similar performance). If that's already a given and
> the decision is between running a specific FreeBSD instance on the
> ESX host or on its own hardware, then you're wise to spec out the
> performance differences.
>
> HTH,
>
> JN
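To put numbers on JN's user-vs-system point first (my arithmetic, so
double-check me):

    user:  5503.038s / 4777.568s = 1.15  (~15% overhead)
    sys:   3049.500s /  992.422s = 3.07  (~3x overhead)
    wall:     1:15:46 /    33:02 = 2.29  (~2.3x slower overall)

So the CPU-bound portion of the build is already at his 15% goal;
virtually all of the loss is system time, i.e. kernel and I/O work
that the hypervisor has to intercept.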
e may be a minor scheduling issue affecting things here.  If you set up the=
 VM with 4 processors, ESX schedules time on the CPU only when there's 4 th=
ings to execute (well, there's another time period it also uses, so even a =
single thread will get run eventually, but anyway...).  The physical instan=
ce will run one thread immediately even if there's nothing else waiting, wh=
ereas the VM will NOT execute a single thread necessarily immediately.  I w=
ould retry using perhaps -j8 or even -j12 to make sure the 4 CPUs see plent=
y of work to do and see if the numbers don't slide closer to one another.  =
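Something like this, i.e. the same tree and steps as the original
runs, just more jobs so the virtual CPUs always have runnable threads:

    # cd /usr/src
    # time make -j8 buildworld

If the gap narrows, the co-scheduling explanation above is probably
the culprit; if it doesn't, I'd look harder at the disk.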
For what it's worth, if a raw LUN were available and passed through to
the VM, the disk performance of that LUN should very nearly match
native performance, because it IS native performance. VMware (if I
understood this right in the first place and remember it correctly as
well; I suppose I should asterisk this one too :) ) doesn't add
anything in that path to slow it down. Plugging a USB drive into the
host and making it available to the guest would likewise run at native
USB/drive speeds, assuming you can do that at all (I've never tried to
use USB drives on our blade center!).
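If you do try the raw-LUN route, the ESX-side step is to wrap the LUN
in a raw device mapping (RDM) the VM can attach. I'm going from memory
here too, so treat the exact syntax as a sketch and check vmkfstools
on your ESX version; the device path and datastore below are made-up
placeholders:

    # On the ESX host: create a passthrough RDM pointing at the raw LUN.
    # vmhba1:0:0:0 and datastore1 are placeholders -- substitute yours.
    vmkfstools -z /vmfs/devices/disks/vmhba1:0:0:0 \
        /vmfs/volumes/datastore1/fbsd72/fbsd72-rdm.vmdk

Attach the resulting .vmdk to the VM like any other disk, and the
guest's I/O hits the LUN with essentially no translation.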
-Rich

* Since I'm recalling it, the standard caveats about my bad memory
apply. In this case, the caveats about the VI instructor's bad memory
apply too. :)
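P.S. On JN's kernel HZ suggestion: the knob is kern.hz, set from
/boot/loader.conf in the guest. Something like the following is the
usual advice for VMs (100 Hz instead of the 7.x default of 1000;
again from memory, so verify before trusting it):

    # /boot/loader.conf (inside the FreeBSD guest)
    # Cut timer interrupts from 1000/s to 100/s; an idle VM burns
    # noticeably less host CPU servicing the virtual clock.
    kern.hz="100"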