Code Answer: 01/29/11

Saturday, January 29, 2011

Is it safe upgrading php through Testing Repository (CentOS)

Hello,

I need to upgrade PHP to 5.2.x

I'm referring to these guides on how to do the upgrade:
http://wiki.centos.org/HowTos/PHP_5.1_To_5.2
http://www.securityhacking.tk/2010/02/install-upgrade-php-5-1-to-5-2-centos-5-4/

But that was testing repository. Is it safe to use it on live server?

Thank you.

From serverfault

The answer is on the CentOS site

...it should not be left enabled or used on production systems...

So that's that.

However due to the age of PHP 5.1.6 and the expectations of customers I can see why you may want to do the upgrade anyway.

From my experience I have a server running 5.2.10 from the testing repository. There have been no problems in using this version from Testing.

Obviously I cannot provide any assurance it won't break on your setup but you can reduce any disruption by enabling rollbacks in Yum. To do this add
```
tsflags=repackage
```
to /etc/yum.conf before enabling the repository and performing the upgrade.
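
If you manage several servers, the edit can be scripted. The following is only a minimal sketch: the function name `ensure_repackage` and the sample file contents are made up for illustration, and it works on the text of the file rather than touching /etc/yum.conf directly, so you can review the result before writing it back.

```python
def ensure_repackage(conf_text: str) -> str:
    """Return yum.conf text with tsflags=repackage present in [main]."""
    lines = conf_text.splitlines()
    # Leave the text untouched if the setting is already there.
    for line in lines:
        if line.replace(" ", "") == "tsflags=repackage":
            return conf_text
    out = []
    inserted = False
    for line in lines:
        out.append(line)
        # Add the flag directly under the [main] section header.
        if not inserted and line.strip() == "[main]":
            out.append("tsflags=repackage")
            inserted = True
    if not inserted:
        # No [main] header found: create one so the option has a home.
        out = ["[main]", "tsflags=repackage"] + out
    return "\n".join(out) + "\n"


# Hypothetical yum.conf contents, for illustration only.
sample = "[main]\ncachedir=/var/cache/yum\ngpgcheck=1\n"
updated = ensure_repackage(sample)
print(updated)
```

Running the function a second time on its own output changes nothing, so it should be safe to call from a configuration management run.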

Then if anything breaks you can rollback. There is an example of how to do this here.

So the steps involved are:
1. Edit /etc/yum.conf, add tsflags=repackage to enable rollback.
2. Add the CentOS Testing Repository. See the instructions here.
3. Upgrade PHP and only PHP by doing yum update php --enablerepo=c5-testing. (Dependencies will be updated too, such as php-gd if installed.)
4. Test your PHP applications thoroughly and check log files for unseen problems.
5. Rollback if necessary.
6. Tidy up /etc/yum.conf
The Testing Repository should be left disabled, by setting enabled=0 in /etc/yum.repos.d/CentOS-Testing.repo to avoid accidentally updating httpd or any other critical applications with newer versions in Testing.

This means you will have to manually check for updates to PHP regularly by using
```
yum check-update php --enablerepo=c5-testing
```
The bottom line is that it appears to work okay, but if you break anything you get to keep the pieces so you best have backups.

From Richard Holloway
Maybe check the atomic turtle repository. They wrote Plesk and these days focus on Atomic Secured Linux. Recent PHP RPMs are available on their site and updated regularly.

Richard Holloway : There are many places you can get newer versions of PHP as RPMs, or you can install from source. The question asked here is whether it is reliable for a production server. The CentOS view is "No, stick with the 5.1.6 shipped by default". I saw the atomic turtle repository in my search for a solution for me. Do you run this version of PHP from this repository, and is it more reliable than CentOS Testing?

Imo : I run this (Atomic RPMs) in a production environment and have not experienced any problems with it. Not used CentOS testing so cannot compare. Sorry.

From
Jason Litka provides 3rd-party RPMs for php: http://www.jasonlitka.com/yum-repository/

Safety is always a sliding scale. Clearly you are already running Centos, meaning you are probably running without a support contract of any kind and you have deemed this to be safe. I'm going to assume that when you hit problems you either try and debug yourself or you ask the community for assistance, and go from there. This is already "unsafe" from a bank/large corporation perspective.

Sticking with either the CentOS Testing RPMs or Jason Litka's will place you more at risk of problems, but a significant portion of the user base still runs this way. A newer version of PHP may introduce new security bugs even as it fixes others. Red Hat/CentOS are sometimes lax in backporting fixes properly (which was why they got hacked - they didn't backport a kernel fix not marked as a security risk).

Going down the Centos testing path may require you to upgrade glibc, which could then mean you have to start upgrading other packages you never intended to touch. For this reason I would recommend Jason's repository.

Please make sure all additional repositories remain disabled (enabled=0) and that you explicitly enable them on the yum command line or using your configuration management tool:

yum --enablerepo=jason install php
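
To double-check that nothing extra has been left switched on, you could scan the .repo files for repositories that are not set to enabled=0. This is a rough sketch using Python's standard configparser; the function name `enabled_repos` and the sample repository entries (including the example.invalid URLs) are invented for illustration. It treats a missing `enabled` key as enabled, which is how yum behaves by default.

```python
import configparser


def enabled_repos(repo_text: str) -> list:
    """Return the ids of repositories that are enabled in .repo file text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    enabled = []
    for repo_id in parser.sections():
        # A repository with no "enabled" key counts as enabled.
        if parser.get(repo_id, "enabled", fallback="1").strip() != "0":
            enabled.append(repo_id)
    return enabled


# Hypothetical repository definitions, for illustration only.
SAMPLE = """
[c5-testing]
name=CentOS-5 Testing
baseurl=http://example.invalid/testing/
enabled=0

[extra-repo]
name=Hypothetical third-party repository
baseurl=http://example.invalid/extra/
enabled=1
"""

print(enabled_repos(SAMPLE))
```

Anything the function reports besides the stock repositories is a candidate for enabled=0.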

Richard Holloway : For reference, you can upgrade only php (and some php dependencies like php-gd if installed) from Testing without having to upgrade a bunch of other stuff.

From mechcow

Is there a Cisco IOS Virtual Appliance?

Need to do test configurations on (unfamiliar) Cisco/IOS equipment. Is there a virtual machine I can light up and use it in my test environment as a real firewall/edge/core router?

From serverfault gravyface

Have a look at GNS3; it's a nice GUI frontend to Dynamips, which is an IOS router simulator.

Kyle Brandt : Might give GNS3 a try again but gave me a bunch of trouble last time I used it, now I just use dynamips with dynagen.

From 3dinfluence
The only Cisco equipment emulator I know of is Dynamips/Dynagen, but its purpose is learning Cisco IOS commands for certification exams, not testing actual networking setups. While you could certainly do that, the performance would likely be very bad. Even connecting two routers on the same machine eats a lot of CPU, and you have to experiment to find which idlepc value works for the image you are using to get lower CPU usage when the router is idle. Otherwise you get high CPU usage even with idle routers.

This is in contrast to Juniper Olives, which have quite good performance.

gravyface : wish I could mark you both right, so I +1'ed 3dinfluence and gave you the correct answer. You need the points more than him anyways. :)

Kyle Brandt : Performance is a good point, I am testing some QoS/rate-limiting setups and can't push more than ~5M through one of them. So I am just scaling down my 10M lab to a 1M lab by dropping a digit :-)

Prof. Moriarty : @gravyface LOL, funniest most thoughtful comment I read so far. Thanks!

From Prof. Moriarty

Why does eth0 show an IP if I'm booting into runlevel 1?

I'm having some issues with networking on a new Linux server I'm building. The OS is SLES 11. When booting into runlevel 1, I see that eth0 is showing an IP. Physically, there is a network cable plugged into the card associated with eth1, and then there is a network cable plugged into a QLogic iSCSI card (eth4, not shown). I've been troubleshooting this for a while, and it seems like eth0 is somehow getting assigned an IP, even though it isn't configured in Linux or even plugged into the network for that matter. Thoughts?

ifconfig -a

Here is the ifconfig output

(Sorry, I need more rep before I can post images on SF...)

From serverfault Banjer

The runlevels are completely configurable from your inittab, so it's possible that runlevel 1 is mapped to a mode with networking activated.

Does look odd though. Try booting into runlevel 2: in Suse that should default to multiuser/no networking.

From Satanicpuppy

What would cause JBoss to take up 4 more GB of memory than what is allocated?

Relevant part of the start up line: java -server -Xms10G -Xmx10G -XX:PermSize=1G -XX:MaxPermSize=1G

This instance ended up taking up 16GB of memory and 10GB of swap before killing the server.

Any ideas on what could cause that?

This is the only major application running on a RedHat system with 16GB RAM and 10GB swap.

From serverfault Ichorus

My guess would be runaway threads. This is probably a better fit for Stack Overflow, where you can ask how to profile memory usage for a Java app and look for memory leaks.

Keep in mind that the options -Xms and -Xmx are for heap only. There are other things that take up memory, such as thread stacks. So maybe there were runaway threads?
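
As a back-of-the-envelope illustration of that point, here is a small sketch that adds thread stacks on top of the heap and PermGen sizes from the question's command line. The 1 MB per-thread stack size and the thread counts are assumptions picked for illustration (the real stack size depends on the JVM and any -Xss setting), and the result ignores other native allocations, so treat it as a lower bound.

```python
def jvm_footprint_gb(heap_gb, permgen_gb, threads, stack_mb=1.0):
    """Rough lower bound on JVM memory: heap + PermGen + thread stacks."""
    stacks_gb = threads * stack_mb / 1024.0
    return heap_gb + permgen_gb + stacks_gb


# -Xmx10G and -XX:MaxPermSize=1G come from the question;
# the thread counts below are hypothetical.
baseline = jvm_footprint_gb(10, 1, threads=500)
runaway = jvm_footprint_gb(10, 1, threads=5000)
print(f"500 threads:  {baseline:.2f} GB")
print(f"5000 threads: {runaway:.2f} GB")
```

Even under these assumptions it takes thousands of threads to add several gigabytes, which is why a thread count is worth checking before anything else.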

You might want to look into ulimit to limit what this application can take, and start using something like Nagios to alert you when memory usage gets out of control. You running the jvm as root?

Ichorus : "You running the jvm as root?" No. We found the answer and it is scary. One detail that I left out is that we are running on a virtualized server on VMWare. Originally we defined the memory to 8GB in VMWare. We changed it to 16GB when we decided to move that one into a production cluster. Apparently, VMWare, despite showing that there were 16GB available, really only had the original 8 (corroborated by sar data leading up to the crash). When we hit that limit, it went into swap, used that up completely and then, of course, the instance died. We are talking to VMWare now about it.

djangofan : Run VisualVM against that server and tell us which type of memory is getting out of hand in your VM.

From Kyle Brandt

How long do managed gigabit ethernet switches take to boot up?

One critical drawback that I have found in researching managed-switches, and one that I have some past experience with is that anything with "lots" of firmware is going to have lots of issues associated with that firmware.

We are in the middle of researching rackmount gigabit switches (48 port). It looks like for 48 ports, our only choice is managed switches (Dell, Cisco/Linksys, HP, etc). What I want to know, and cannot find much information about, is the boot time for various managed switches.

If you own one, can you please answer with the model number, and the cold boot time in seconds. I have read online that Linksys (now Cisco) SRW series sometimes take almost 5 minutes before they are fully booted up, and that is an unacceptable cost for us.

I particularly want to know about Dell PowerConnect managed switch bootup time (model 3548 and 5448), and would like to confirm the 5-minute boot time on the SRW2048 or similar model, and any HP ProCurve boot up times.

The composite of all those figures ought to form an interesting overall picture of boot-up times on managed switches.

[UPDATE: Further to those who think I am asking about boot-up time because I am silly enough to think that has anything to do with the actual operational performance, I have updated the above, to make it more clear that I'm interested in understanding the norms of this hardware type, not in forming an overall impression on switch performance based on one edge-case of boot time. Thanks for your time.]

[UPDATE2: I'm going to add my own answer for the managed SRW switch that we bought yesterday, a Cisco (former-linksys) model ... Is there anything wrong with not accepting AN ANSWER On this? I'd like to keep this question open to collect data points which might be useful to others, as well as to myself. In general, the longest time is 5 minutes, and the shortest are 1-2 minutes, with a nifty exception for the one HP ProCurve mentioned, which is super fast.]

From serverfault Warren P

I can't imagine a reason why you would be rebooting switches often enough in any environment to even worry about this. Any reboot of a switch should be done in a maintenance window and then a few minutes isn't going to be a big deal.

I'm not sure how you think that booting time reflects the switch performance. Switches, like most embedded devices, will have an underpowered CPU of some sort which is responsible for the booting process and maybe a few functions such as running the cli or web interface. But almost all of the networking functions are going to be handled by purpose built ASICs and won't involve the CPU at all.

Zypher : +1 started writing the same thing, then got distracted

Dan : +1 I agree, why is switch boot time so important? Any/all planned downtime is just that, planned.

Warren P : Unplanned happens all the time. We had switch failures here last week. You just need one day where you have multiple switch problems, and you have to re-route the whole office network, and you start to care about little things like this. Because it's 5 minutes PER cold boot. And on a day when you had 10 of them, it's annoying.

3dinfluence : Fair enough, but it's been my experience that outages due to a switch failure are very rare, though they do happen. If you had to reboot a switch 10 times in a day then boot time isn't going to change the disruption drastically. The end result is going to be an up and down network resulting in lost productivity if we're talking end users. Would you rather have a switch that takes 5 minutes to boot but would have fixed the problem in 1 reboot, or a switch that takes 3 minutes to boot but took 5 reboots to work out your issues? I'm just saying that boot time may not be the win you're looking for.

Farseeker : Agree with everything you wrote, but -1 cos it's not what the OP asked for (don't worry I gave you a +1 on your other answer so you're still 8 rep ahead!)

From 3dinfluence
SRW2048 from a cold start running 1.2.1, 97 seconds
```
tsavo:~ mcd$ date
Mon Apr 12 14:04:48 EDT 2010
tsavo:~ mcd$ ping 192.168.24.70
PING 192.168.24.70 (192.168.24.70): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2

... snipped ...

Request timeout for icmp_seq 85
64 bytes from 192.168.24.70: icmp_seq=86 ttl=64 time=45.284 ms
^C

tsavo:~ mcd$ date
Mon Apr 12 14:06:25 EDT 2010
```
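
For anyone collecting these data points, the elapsed time can be worked out from the two `date` outputs rather than by counting ping lines. A minimal sketch: the helper name `parse_date_output` is made up, and the time zone abbreviation is simply dropped on the assumption that both stamps come from the same machine in the same zone.

```python
from datetime import datetime


def parse_date_output(text):
    """Parse `date` output such as 'Mon Apr 12 14:04:48 EDT 2010'."""
    parts = text.split()
    # Drop the time zone abbreviation (5th field); both stamps share it.
    without_tz = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_tz, "%a %b %d %H:%M:%S %Y")


start = parse_date_output("Mon Apr 12 14:04:48 EDT 2010")
end = parse_date_output("Mon Apr 12 14:06:25 EDT 2010")
elapsed = int((end - start).total_seconds())
print(elapsed)  # 97, matching the figure quoted for the SRW2048
```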
Warren P : Thanks for providing what I asked for. Lots of people can't understand why measuring performance is even important. An unmanaged switch is back online in very little time. The time it takes a managed switch to boot up is something that the network admins need to take into account. It may not happen that often, but when you have people asking "when will the system be back up", it's unexpected to have to say "well the server takes 3 minutes to boot, but our switch takes 5 minutes".

kmarsh : +1 for actually answering the question, instead of questioning the question. While I initially had the same "why" reaction, I suddenly realized that there are many systems that have contractual uptime requirements and penalties.

3dinfluence : @kmarsh If there are uptime requirements such as an SLA then the network needs to be designed with that in mind. That's not always possible at the edge of a corporate network, but if you keep the edge switches to 24 ports the risk of affecting productivity can be minimized. The chassis based switches that you'll find at the core of most larger networks deal with this type of stuff quite well, with multiple hotswap PSUs and controller modules. But like you said in your comment you can also do things at the network layer w/ RSTP/PVST, dynamic routing protocols, and ethernet bonding.

From

Ok here's another data point for you from a PowerConnect 5324. Which is a few generations behind the models you're looking at. So take it for what it's worth.

The ping command below was sending 1 ping per second, so you can see from the output that it took 108 seconds from the point where it went down from the reload command to the point that it started replying again.

PowerConnect 5324 reboot 108 seconds

date && ping 192.168.0.2 && date
Thu Apr 15 00:06:45 EDT 2010
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=2.53 ms
64 bytes from 192.168.0.2: icmp_seq=2 ttl=64 time=2.54 ms
64 bytes from 192.168.0.2: icmp_seq=3 ttl=64 time=2.55 ms
64 bytes from 192.168.0.2: icmp_seq=4 ttl=64 time=2.60 ms
64 bytes from 192.168.0.2: icmp_seq=5 ttl=64 time=2.55 ms
64 bytes from 192.168.0.2: icmp_seq=6 ttl=64 time=2.76 ms
64 bytes from 192.168.0.2: icmp_seq=7 ttl=64 time=2.50 ms
64 bytes from 192.168.0.2: icmp_seq=8 ttl=64 time=2.63 ms
64 bytes from 192.168.0.2: icmp_seq=9 ttl=64 time=3.51 ms
....
64 bytes from 192.168.0.2: icmp_seq=117 ttl=64 time=2026 ms
64 bytes from 192.168.0.2: icmp_seq=118 ttl=64 time=1028 ms
64 bytes from 192.168.0.2: icmp_seq=119 ttl=64 time=30.1 ms
64 bytes from 192.168.0.2: icmp_seq=120 ttl=64 time=3.80 ms
^C
--- 192.168.0.2 ping statistics ---
120 packets transmitted, 13 received, +45 errors, 89% packet loss, time 119202ms
rtt min/avg/max/mdev = 2.502/239.520/2026.970/583.213 ms, pipe 4
Thu Apr 15 00:08:45 EDT 2010
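
The 108 second figure comes from the jump in icmp_seq between the last reply before the reload and the first reply after it. If you want to pull that out of a saved ping log automatically, something along these lines should do. It is a sketch only: the function name `longest_gap` is made up, and it assumes one ping per second so that a gap in sequence numbers equals seconds of downtime.

```python
import re


def longest_gap(ping_lines):
    """Return (last_seq_before, first_seq_after, gap) for the largest gap in replies."""
    seqs = []
    for line in ping_lines:
        # Only successful replies contain "bytes from"; timeouts are skipped.
        match = re.search(r"bytes from .*icmp_seq=(\d+)", line)
        if match:
            seqs.append(int(match.group(1)))
    best = (None, None, 0)
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > best[2]:
            best = (prev, cur, cur - prev)
    return best


# A few lines taken from the output above.
log = [
    "64 bytes from 192.168.0.2: icmp_seq=8 ttl=64 time=2.63 ms",
    "64 bytes from 192.168.0.2: icmp_seq=9 ttl=64 time=3.51 ms",
    "64 bytes from 192.168.0.2: icmp_seq=117 ttl=64 time=2026 ms",
    "64 bytes from 192.168.0.2: icmp_seq=118 ttl=64 time=1028 ms",
]
print(longest_gap(log))  # (9, 117, 108)
```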

Warren P : That's good to know. If the older generations are under 2 minutes, surely the latest PowerConnects are also under 2 minutes.

From 3dinfluence

I don't have the exact times on hand, but we have both Cisco (3750) and HP switches (2524 & 2510G). The Cisco ones indeed take several minutes to start up. The HP ones take about 30 seconds. The HP ones are 24 port, and it tests each port (does about 4 ports per second), so a 48 port would take slightly longer.

Warren P : Thanks. The Cisco 3750 is a catalyst/ios series right? The ones I was originally asking about are the former Linksys now rebranded as "cisco" small business switches and are non-ios non-catalyst.

Chris S : Yeah, the 3750 is an IOS-based device. I think all the Catalyst devices have been phased out now, but I'm no expert.

From Chris S

Strategy to allow emergency access to colocation crew

I'm setting up a server at a new colocation center half way around the world. They installed the OS for me and sent me the root password, so there's obviously a great amount of trust in them.

However, I'm pretty sure I don't want them to have my root password on a regular basis. And anyway, I intend to only allow key-based login.

In some cases, though, it might be useful to let their technical support log in through a physical terminal. For example, if I somehow mess up the firewall settings.

Should I even bother worrying about that?
Should I set up a sudoer account with a one-time password that will change if I ever use it?
Is there a common strategy for handling something like this?

From serverfault itsadok

Well, a lot will obviously depend on the specifics of the case, but you should keep in mind that with physical access to the machine, they can practically do anything they want anyway.

The common solution for this is to give them a dedicated maintenance account that has root rights via sudo. Then you can give them the pw when you want them to have root access. If you want to take away root access, just change the pw on the maintenance account.

At any rate you can configure SSH to only allow key-based logins. Then the maintenance account + pw would only be usable for logins at the physical console (even if it is enabled), further restricting the access to the system (if you want to).

Ryaner : We do this and it works well. You can add command logging to the maintenance account too if you really want to look and see what has been done

From sleske
You should buy an option for access via IP-KVM. You will have access to everything, including single-user mode and BIOS.

kbyrd : How does that solve the security issues in the original question?

minaev : Actually, it does not eliminate them completely (sleske is absolutely right here, physical access makes all precautions useless), but mitigates to an acceptable level, when you know for sure that nobody besides you has any right to log in. Any attempt to log in is an intrusion.

From minaev
We have a similar setup for some external boxes. We keep the root password secret and only give it out when it's needed; when done, we change it. We do not allow root logins via ssh, so the password is only relevant when you have physical access.

From rkthkr

Will wear induced by turning computers off in the evening be offset by energy savings?

I'm asking this here because this is primarily a huge office scenario and administrators will more likely have the answer I'm looking for.

Employees' desktop computers can be either left turned on for the whole night or switched off in the evening and turned back on in the morning. The latter will surely save energy. At the same time turning on and off is very harmful for the equipment - hardware often breaks specifically when turned on.

Both energy and hardware replacements cost money. With energy it's quite obvious - you pay every month according to what your power meter shows. With hardware replacements it's worse - you need qualified staff to quickly diagnose the problems and once something breaks the affected employee will have to wait for some time while his computer is fixed/replaced and the data is recovered.

So the company has to choose between saving money on energy and saving money on computer maintenance and lost hours. Such decisions must be well thought out.

Is there any detailed study of how turning computers off each evening affects their lifetime and what losses are induced by it?

From serverfault sharptooth

Not switching PCs off may cause overheating, which decreases hardware lifetime.

sharptooth : True, but if a computer has problems with heat it will incur enough damage even during a normal 9-hour day. I agree that 9-by-5 is much less than 24-by-7, but still I'm asking about normal functioning. +1 anyway - an important factor to consider.

pehrs : Actually a lot of desktop hardware is only rated for 9 hour cycles, not for 24/7... For that you are expected to buy "server" hardware.

From Dmitry Trukhanov
Sorry, but I didn't get what you mean where you say "hardware often breaks specifically when turned on." Does turning computers off and on damage them?

Further, I would suggest putting the machines into suspend mode if it is not possible to switch them off completely, so that at least some power is saved :)

sharptooth : Yes, turning any piece of electronics off and then on several times usually causes more damage to it than just running it for several minutes.

Knight Samar : Hmm...interesting, didn't ever think of that :) BTW, a Google search took me here...maybe something in this may be useful for you. http://www.federalelectronicschallenge.net/resources/docs/oandm.pdf

duffbeer703 : No, turning electronics on and off does not damage the hardware. (At least not in the context of energy management.... abusive behavior like rocking the power switch back and forth obviously isn't a good idea.)

From Knight Samar
First, if this is a "huge office scenario", you will hopefully have warranty contracts that cover most of the lifetime of the machines. In that case, if it breaks, it's the vendor's job to repair it and you can just reap the energy savings.

Beside that: While I would agree that there is a somewhat increased possibility that hardware will die during a power cycle, I consider this to be a problem of (really) old hardware and I can't see how one cycle every working day over a course of three to five years would cause a problem except on very crappy hardware that might die anyway whenever you look at it the wrong way.

One major issue remains, which decides the whole game in my opinion, and this is the hard disk. Desktop drives are not designed to run 24x7, and I have personally experienced a significant number of drive failures of non-RAID-type drives used in 24x7 server scenarios.

So, in the end: Turn the machines off and save the energy. There is nothing else to gain.

sharptooth : I seriously doubt warranty will cover lost hours.

SvenW : Again, you are talking about a "huge office scenario". For me, this means having replacement machines available, automatic deployment etc. etc, with maybe a 30min time for replacing a dead machine. And as I said, keeping systems running actually increases the chance of a hardware failure. So, in my book there are no two choices between saving energy and saving money on maintenance and lost hours, because you get both by turning machines off.

duffbeer703 : If you're in a "huge" office and have dedicated IT staff, a PC failure should not be an event that takes hours to recover from. You should be seeing 3-5% failure rates already, and have procedures for dealing with them.

From SvenW
How much does it cost in power to run the PCs 24/7? This is a question most companies can't answer off the bat. However, it should be considered that of 168 hours in a week, each PC will be used around 40 hours, or however long your working week is (at most, there's still lunch break and not everybody does all work on the PC all the time). This means you quadruple energy consumption by not switching them off.

My office desktop draws 250 W, so I use 1 kWh every four hours, or 42 kWh per week, if I left it running all the time. If you pay $0.15 per kWh, this means more than $300 a year, of which $225 are just waste. At that rate, the damage done by a power cycle every weekday would have to decrease the MTBF quite drastically to get economically meaningful.
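
If you want to plug in your own wattage and tariff, the calculation is easy to reproduce. This is a rough sketch, assuming 52 weeks a year and a 40-hour working week; the function name is made up. With those assumptions the 250 W example comes out a little higher than the rounded figures quoted above.

```python
def annual_energy_cost(watts, hours_per_week, price_per_kwh, weeks=52):
    """Yearly electricity cost for a device drawing `watts` while switched on."""
    kwh_per_week = watts * hours_per_week / 1000.0
    return kwh_per_week * weeks * price_per_kwh


always_on = annual_energy_cost(250, 168, 0.15)    # running 24/7
office_hours = annual_energy_cost(250, 40, 0.15)  # on only during work hours
wasted = always_on - office_hours
print(f"always on: ${always_on:.2f}")
print(f"office hours only: ${office_hours:.2f}")
print(f"spent while nobody is there: ${wasted:.2f}")
```

A machine that idles well below its rated draw will cost less than this, so measure a sample of PCs before quoting savings to management.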

I switch my PCs and laptops off whenever I know I won't use them for the next 30-60 minutes. I have used them for many years that way. In fact, I have never even heard of a PC breaking by switching it on and off.

In working as an energy consultant for businesses, I've seen this theme pop up over and over. Not just the PCs, your whole building is only used 25% of the week. The amount of energy that's used outside those hours, for all kinds of things, is astonishing, to say the least.

From Hanno Fietz
I work for a large state government where we recently began implementing power management. According to our calculations, based on metering a sample population, shutting off PCs during idle periods saves about $35/PC per year for a mainstream business desktop with LCD monitor. Your mileage will vary, so do some testing yourself.

Laptops generally provide less savings; CRT monitors and workstation class devices increase savings.

We looked into the issue of hardware failure at great length, and based on research, testing, and a production implementation that's about 6 months old, I can find no evidence supporting the assertion that turning a computer on and off "wears out" anything or causes other hardware related issues. We've observed no statistically significant increase in hardware failure. (If anything, it has gone down slightly due to refresh of older equipment.)

You will find other issues, such as:
- Annoyed end-users
- Older PCs that don't like to resume properly
- Misbehaving applications
These issues don't have magic bullet fixes. You need to communicate with end users, test your applications and test your older hardware.

Chris S : We went through a similar process last year. Most research we found didn't show a large difference in the life of the computer. Considering most computers are replaced after 4-5 years, you might as well save the $200 over the life of the computer. Also, we noticed the largest differences in power consumption by replacing old technologies (like CRTs) and simple 'fixes' like enabling stand-by after an hour, or turning off the monitor after 30 minutes. More reading: http://michaelbluejay.com/electricity/computers.html

jscott : +1 for "Annoyed end-users". We've tried power management, nightly shutdowns *and* scheduled morning BIOS power-ons. People still get upset (and I don't know why). We can't win. :)

From duffbeer703
I wish I had detailed data, especially the type of numbers you want.

You might simply need to do A/B testing to see which is better.

However, I would say that the harm of turning stuff on-and-off is kind of hypothetical sounding to me:

The latter will surely save energy. At the same time turning on and off is very harmful for the equipment - hardware often breaks specifically when turned on.

First, what is the lifetime of a PC? Let's say it is 5 years. Turning it off and on again once a day really is not a lot. I remember, in my childhood, we turned off Apples, IBM PCs and Atari class systems a dozen times a day, for years. Computers have only gotten more reliable since then.

Second, I doubt a higher failure rate can be uniquely associated with the daily power-down. Even if you leave your computer on all the time, you still have to reboot for system updates. In my experience, that is the source of most of my boot-time problems.

Third, the relative costs will continue to diverge over time. If you have to assume, the safe assumptions are probably: energy costs will continue to rise over time, hardware costs will decrease over time. So, even if this is break-even now, this is a behavioral/cultural change, and you need to start now so people will be doing it when the price benefits arrive, rather than suffering from a lag time.

The real cost of a power-down is that you have to close up all your windows and save your work. On my Macs, almost all my applications have an auto-save feature. For Windows, I usually use "hibernate".

From benc

Database cluster... without Master/Slave?

Hi there!

I'm wondering if it is possible to have a set of SQLdb servers to which data is written and have them replicate, avoiding conflicting information.

I imagine that a Master/Slave structure would be mandatory, I would like to know if a system where servers have no hierarchy could support replication.

Currently I'm using MySQL, but I would be happy to move to another database if needed.

Any ideas? :)

From serverfault RadiantHex

You could have a multi-master setup / cluster.

See http://en.wikipedia.org/wiki/Multi-master_replication

PostgreSQL has similar features to MySQL but is considered more enterprise class, yet is also OSS. It is easy to get excellent support for it via #postgresql on irc.freenode.net

I'm not sure if there would be some effects on ACID though.

Warner : What would you say makes Postgres more "enterprise class"?

Brennan : Postgres has been historically more of an enterprise class database because:
- Better adherence to SQL
- Superior ACID compliance
- A richer feature set
- It did not install a risky storage engine by default (MyISAM)
- Full support for foreign keys
- Full transactions support
Please see http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems & http://troels.arvin.dk/db/rdbms/ for more details

From Brennan
A multi-master setup is the only setup with no hierarchy that comes to my mind right now. By the way, there's also the DRBD option for replicating data, which might fit your needs: DRBD with Heartbeat for failover.

From Maxwell

VMWare Server :: VM set to 2gb RAM but vmware process shows 100mb physical, 1900mb virtual

I've set up a VMWare instance to run CastIron Integration Appliance. I allocated 2gb of memory to the instance, assuming it would take this as physical memory (my server has 8gb total).

When I view top however on the server, the vmware-vmx process has about 100m Resident memory and 1900m Virtual.

Running CastIron it reports that the appliance often hits 50% memory usage. Does this mean I'm using 900mb of harddrive space as memory? I wanted VMWare to use 2gb of physical memory, no swap. Can anyone tell me how to achieve this?

Setup
Debian Lenny 5.0.3
VMWare Server 2.0.2

From serverfault brad

Unless you're using ESX and making VM resource reservations your VM will not be given any more physical memory than is being used, i.e. if you give your VM 4GB but only ever address 1GB then only 1GB of physical memory is taken up.

I'm not sure where the 50% figure comes from but if that VM's vmware-vmx process is only using 100MB then that's all that's being used.

Basically don't worry about it :)

brad : well i'm worrying about it because i'm getting poor performance from castiron. I'm not sure how it reports its 50% either, but in the vmware admin interface, it shows that my VM has 2gb memory and is using 389mb, but my vmware-vmx process shows 84m physical memory being used. what am I missing?

From Chopper3
First, vmware ALWAYS creates swap. It's required. If you do not set reservations, the ESX host creates a .vswp file equal to the difference between the amount of physical memory assigned to the virtual machine and the reservation it has. By default, memory reservations are set to 0. If you have a virtual machine with 2GB of memory without a reservation, it creates a 2GB .vswp file when it is powered on. Whether it uses it or not depends on other factors (primarily whether you have enough free RAM on the host to support the guests' requests). If you make reservations for your virtual machines that are equal to the amount of RAM assigned to them, swapping and page sharing do not occur.
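
The sizing rule described here is simple enough to write down. A tiny sketch, with a made-up function name, just to make the relationship between assigned memory, reservation and swap file size explicit:

```python
def vswp_size_mb(configured_mb, reservation_mb=0):
    """Swap file size as described above: assigned memory minus reservation."""
    if not 0 <= reservation_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and the configured memory")
    return configured_mb - reservation_mb


print(vswp_size_mb(2048))        # no reservation: swap file as large as the VM's memory
print(vswp_size_mb(2048, 2048))  # full reservation: nothing left to swap
```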

Second, you can give a VM whatever you want, but VMware is only going to report what the guest actually uses. The amount of RAM you allocate is the maximum that VM will ever use.

From Jim B
VMware Server has a setting to define if you want all VM memory to fit in physical RAM, or allow some of them to be swapped; it's in the host settings.

If you have more RAM than you're using, you can safely set it to only use RAM; you will then not be able to power on more VMs if there's not enough available physical memory, of course.
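A quick way to sanity-check whether a set of VMs can all stay in physical RAM is to add up their allocations. This is only a back-of-the-envelope sketch in Python: the function name and the 1024 MB set aside for the host OS are assumptions, not VMware's actual admission logic.

```python
def fits_in_ram(vm_memory_mb, host_ram_mb, host_overhead_mb=1024):
    """True if the VMs' combined allocation fits in RAM left after host overhead."""
    # host_overhead_mb is an assumed figure for the host OS itself.
    return sum(vm_memory_mb) <= host_ram_mb - host_overhead_mb


# The poster's case: one 2 GB guest on an 8 GB host.
print(fits_in_ram([2048], 8192))
# A hypothetical heavier load that would not fit.
print(fits_in_ram([2048, 4096, 2048], 8192))
```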

brad : Can you go into more detail? This is exactly what I'm looking for but I don't see anywhere in the web interface where I can define that. What exactly do you mean by 'host settings' ? Is this in some config file? or is it through the web admin interface?

Massimo : In the web admin interface, if you select the **host** in the inventory on the left panel, you will see a link in the right panel called "Edit Host Settings"; the option is there. This should help: http://www.virtuatopia.com/index.php/Configuring_VMware_Server_2.0_Host-Wide_Settings

brad : amazing thx....

From Massimo

When is a secondary nameserver hit?

Take this scenario:

domain: foobar.com
ns1:    2.2.2.2
ns2:    3.3.3.3

My question: Is ns2 hit just in the event that ns1 is down? Or, is ns2 hit any time that ns1 returns a miss/doesn't resolve the query? I know ns2 would be hit if ns1 ever went down; but, what if ns1 is up and just doesn't have the data?

From serverfault Evan Carroll

If NS1 doesn't have the data, NS2 will not be used. Any server that is listed as a valid DNS server for a domain is assumed to have the proper data so if NS2 says there's no such record when queried, the computer making that request will assume that is correct.

From
NOTE: This answer is about DNS client configuration. After some comment discussion, it now appears to me that the OP could be asking about DNS server or domain DNS configuration. If that is the case the premise holds (NS2 only hit if NS1 not available), but the specifics are not relevant.

The DNS client queries the secondary DNS Server only when the primary DNS server does not respond. If the primary responds with "sorry, wrong number", that response is passed back from the DNS client to the application attempting to communicate.

UPDATE

While a particular DNS client can be programmed to do whatever it wants, the standard is to go in order.

From the Microsoft Technet Understanding DNS client settings page - emphasis mine.

Configuring a DNS servers list

For DNS clients to operate effectively, a prioritized list of DNS name servers must be configured for each computer to use when it processes queries and resolves DNS names. In most cases, the client computer contacts and uses its preferred DNS server, which is the first DNS server on its locally configured list. Listed alternate DNS servers are contacted and used when the preferred server is not available. For this reason, it is important that the preferred DNS server be appropriate for continuous client use under normal conditions.

From the Ubuntu Resolver man page - emphasis mine

The different configuration options are:

nameserver Name server IP address

Internet address (in dot notation) of a name server that the resolver should query. Up to MAXNS (currently 3, see ) name servers may be listed, one per keyword. If there are multiple servers, the resolver library queries them in the order listed. If no nameserver entries are present, the default is to use the name server on the local machine. (The algorithm used is to try a name server, and if the query times out, try the next, until out of name servers, then repeat trying all the name servers until a maximum number of retries are made.)
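The algorithm in that man page excerpt can be sketched in a few lines of Python. This is a simulation only: `resolve`, `Timeout`, `fake_query` and `max_rounds` are hypothetical names, and the query function stands in for a real DNS lookup.

```python
class Timeout(Exception):
    """Raised by the (hypothetical) query function when a server does not answer."""


def resolve(name, nameservers, query, max_rounds=2):
    """Try servers in listed order; move to the next one only on a timeout."""
    for _ in range(max_rounds):
        for server in nameservers:
            try:
                # Any response, including "no such name", is treated as final.
                return server, query(server, name)
            except Timeout:
                continue
    raise Timeout(f"no name server answered for {name}")


def fake_query(server, name):
    """Stand-in for a real lookup: the first server is down, the second answers."""
    if server == "2.2.2.2":
        raise Timeout(server)
    return "NXDOMAIN"


print(resolve("foobar.com", ["2.2.2.2", "3.3.3.3"], fake_query))
```

Note that a negative answer from the first server ends the lookup; the second server is consulted only when the first does not respond at all.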

fahadsadah : Why was this downvoted?

ktower : Because it is wrong? From the resolver's perspective, there is no such thing as "primary DNS" and "secondary DNS" servers in this case. Both are equally authoritative for the zone and equally likely to receive (and respond to) a query.

ktower : For the record, I was not the one who downvoted you. The document you quote seems to describe how a (Windows) client chooses a DNS server from a "locally configured list" to do recursive queries. The OP appears to be asking how queries arrive to DNS servers listed as being authoritative for a zone. As the NS RR has no "priority" field, any DNS server listed within the zone should be considered equally authoritative and equally likely to receive a DNS query.

tomjedrz : OK - I think - for some reason I missed that he was asking about configuring the DNS server.

From tomjedrz
if ns1 is up and just doesn't have the data?

If NS1 is "up", and returns NXDOMAIN (no data), then clients will cache it. They won't waste their bandwidth trying NS2.

I know ns2 would be hit if ns1 ever went down

This is not necessarily true: If NS1 is down (does not respond/timeout), some dns clients will simply give up.

For high availability applications, assume both nameservers are single points of failure. The terms "primary" and "secondary" are obsolete, with regards to DNS servers.

Evan Carroll : why would you ever not-hit ns2 if ns1 is down? what is it there for then?

John Gardeniers : "This is not necessarily true: If NS1 is down (does not respond/timeout), some dns clients will simply give up." Care to offer some examples of this broken behaviour?

geocar : @John Gardeniers : This is an observation from the result of a bet made running tcpdump on a pair of busy nameservers, and turning one of them off. Less than 40% of sites (assuming sites are /16) tried the other nameserver when they queried the down nameserver. Since then, I've always assumed some popular dns cache was buggy here.

John Gardeniers : @geocar, thanks for that info, which is interesting to say the least.

geocar : @Evan Carroll: pehrs answered your first question one on another answer. The reason it's *there* is largely historical; old versions of BIND were very slow, and so resolvers can try either NS response. If it times out, they can again try *either* NS response. This works well if you're trying to distribute load, but it means they cannot reliably be used for failover. Modern nameservers aren't slow enough for this to be a good reason, which is why I say they're obsolete.

From geocar

Note that which server the clients will hit depends on the resolver implementation. Some resolvers will strictly go for NS1; some will randomly choose NS1 or NS2. In either case, if the server responds they will not try the other server. The only time they try the other server is when the first server is unable to serve the request.

To have a look at a more realistic scenario:

```
#dig NS google.com
;; QUESTION SECTION:
;google.com.            IN  NS

;; ANSWER SECTION:
google.com.     297286  IN  NS  ns3.google.com.
google.com.     297286  IN  NS  ns2.google.com.
google.com.     297286  IN  NS  ns4.google.com.
google.com.     297286  IN  NS  ns1.google.com.

;; ADDITIONAL SECTION:
ns1.google.com.     297067  IN  A   216.239.32.10
ns2.google.com.     297074  IN  A   216.239.34.10
ns3.google.com.     297074  IN  A   216.239.36.10
ns4.google.com.     297067  IN  A   216.239.38.10
```

And then we do it again:

```
#dig NS google.com
;; QUESTION SECTION:
;google.com.            IN  NS

;; ANSWER SECTION:
google.com.     297249  IN  NS  ns3.google.com.
google.com.     297249  IN  NS  ns2.google.com.
google.com.     297249  IN  NS  ns1.google.com.
google.com.     297249  IN  NS  ns4.google.com.

;; ADDITIONAL SECTION:
ns1.google.com.     297030  IN  A   216.239.32.10
ns2.google.com.     297037  IN  A   216.239.34.10
ns3.google.com.     297037  IN  A   216.239.36.10
ns4.google.com.     297030  IN  A   216.239.38.10
```

Here you can see how Google changes the order of the name servers to spread the clients out more evenly, to avoid exactly the scenario where every client hits ns1. They still include all the servers to make sure that if one goes down you will still get your data. If one of them gives bad answers, however, you are out of luck. It's not a situation DNS is designed to handle.
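To check the two ANSWER sections above mechanically, a small Python sketch can extract the name server order from each and compare them. The helper name `ns_names` is hypothetical; the input lines are copied from the dig output above.

```python
def ns_names(answer_section):
    """Extract name server host names, in listed order, from dig ANSWER lines."""
    names = []
    for line in answer_section.strip().splitlines():
        fields = line.split()
        # Expected layout: owner, TTL, class, type, target.
        if len(fields) == 5 and fields[3] == "NS":
            names.append(fields[4])
    return names


first = """
google.com.     297286  IN  NS  ns3.google.com.
google.com.     297286  IN  NS  ns2.google.com.
google.com.     297286  IN  NS  ns4.google.com.
google.com.     297286  IN  NS  ns1.google.com.
"""
second = """
google.com.     297249  IN  NS  ns3.google.com.
google.com.     297249  IN  NS  ns2.google.com.
google.com.     297249  IN  NS  ns1.google.com.
google.com.     297249  IN  NS  ns4.google.com.
"""

print("same servers:", sorted(ns_names(first)) == sorted(ns_names(second)))
print("same order:  ", ns_names(first) == ns_names(second))
```

The set of servers is identical in both responses; only the order differs.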

From pehrs

+1 to tomjedrz for being right, from the DNS client perspective. When a DNS client needs to resolve a DNS record it queries its configured DNS servers, in order of precedence (Preferred then Alternate) as tomjedrz stated.

+1 to ktower for being right from the DNS server perspective when that DNS server is acting as a resolver for a DNS client.

When my computer needs to resolve a DNS name, it queries its configured DNS servers, in order if needed. If those servers are not authoritative for the name in question they will attempt to locate and query a name server or name servers that are authoritative for the domain in question, in any order, on behalf of the DNS client.

From joeqwerty

Static IP question

If I want to set a static IP for my AD DS, do I need an ISP which provides this facility? Also, if my VMS also need a static IP, would this have to be another IP or can it be the same? (I know this sounds a bit noobish).

Thanks

From serverfault blade

If you are talking about an incoming IP address, then yes the ISP must provide it to you. You could technically create your own but only your internal DNS would point to it, and no one outside your network could get to it.

You could use the same IP if you have a router in place that does NAT, or use a version of port forwarding (e.g. anything incoming from outside on port x redirects to your VMS on port y).

From Theo
First, are we talking about DSL/cable style ISPs for home or small business networks?

In this case, you would likely just give your AD server a static private IP address inside your LAN (like 192.168.10.100) and create a corresponding local DNS zone. If you want a public static address (accessible from the outside), you would indeed need an ISP offering this service. I generally consider this to be a very bad idea, as there is normally no reason for an AD server to be accessible worldwide. Also, you would likely need to make your AD server the router/gateway of your network, which is an even worse idea IMHO.

What is VMS in this context? I doubt you mean VAX/VMS :) Should you mean virtual machines, then just give them additional private (but static) addresses and set the VM network mode to bridged. Use portforwarding on your router to make them accessible from the outside.

blade : Well my internet package is known as DSL - http://www.plus.net/ , I have the basic package (does need an upgrade I am sure). Everything you said makes sense, but how would I arrive at an IP value (like you got 192.168.10.100)? VMS means Virtual Machines, indeed.

From SvenW
Short answer: No. Just set a static internal IP like 192.168.0.10. When you DCPROMO your first domain controller it complains if you're using DHCP to set the server address, and I'm guessing this is what prompted your post.

Long answer: Active Directory is an internal service that you should be running on an Internal (non-routed) IP address range (like 192.168.x.x which would not be a valid address on the Internet). If you wish to expose a part of your internal LAN to the external Internet or other networks, you need to look at using Network Address Translation (NAT) on your router. Then you'd use that to 'map' your Internet IP back to your internal IPs as needed.
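To check whether a candidate address falls in an internal (non-routed) range, Python's standard `ipaddress` module can be used. A minimal sketch; the helper name `is_internal` is hypothetical, and the sample addresses are taken from elsewhere in these answers.

```python
import ipaddress


def is_internal(address):
    """True if the address lies in a private (non-Internet-routed) range."""
    return ipaddress.ip_address(address).is_private


for addr in ("192.168.0.10", "192.168.10.100", "216.239.32.10"):
    kind = "internal" if is_internal(addr) else "public"
    print(f"{addr}: {kind}")
```

Any unused address in your router's private subnet, outside the range its DHCP server hands out, is a reasonable choice for a static internal IP.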

As for external static IPs... you probably DO want one of these, but not for anything to do with AD. If you're running a business and will be hosting any kind of externally-accessible service on your network, and serving to the Internet, then yes, you need an ISP that will provide you with a static IP. Many business-grade connections include this. You also need sufficiently powerful networking equipment that will allow you to isolate a De-Militarized network zone (DMZ).

From Chris Thorpe

What is the structure of a typical IT department?

I know most IT departments vary greatly depending on size and type of company, but I'm just wondering what the typical IT org chart would look like.

From serverfault Brett G

It depends on how you isolate the different roles: networking, systems, applications, QA, etc. Oh yeah, don't forget your CIO, if you have one. In our case, we're under the Finance umbrella.

From
How about this one.

This one may be applicable in a larger organization.

From 3dinfluence
Kyle nailed it in that comment up there. More seriously, there isn't much in the way of 'typical'. There are many, many ways to arbitrarily draw lines between IT functions. Generally speaking, the larger the organization the more specialized IT functions can get. General Motors can afford Storage Architects and entire departments of people devoted to supporting a single (very important) application. The local community college may have three people who do it all, with a few professor-types assisting. Your local City may have IT as a sub-unit of the Finance Office, even though Public Works is one of the biggest consumers of data (all that sub-street information to keep track of).

Mmmyeah. Talking about a typical IT department is about as easy as talking about your typical mammal.

From sysadmin1138
In small and medium businesses (SME) IT departments have historically grown organically, and often haphazardly.

In large organizations, they have come to mimic the typical organizational structure of a department or division. Where IT is not a core business function (i.e. you don't sell IT services), it is often viewed as a non-revenue-generating expense (like legal, HR, and too often facilities management).

You'll see a CIO or VP of (Information) Technology at the top, with managers for each division.

Common division titles: Operations, Development, Networking (Infrastructure), Information Management / Data warehousing / Database Group, and Client Services or Help-desk. These divisions are not uniform in naming, but often their basic or core function is identical regardless of the naming (Development is sometimes simply called Software for example).

I think ITIL and PRINCE2 might be two sources that could have more "standard" terms or org charts. I cannot think of any other practices or methodology, but anything that is an "expensive buy in, and main purposes seems to exist solely to be the basis to justify yet another re-organization" would be suitable. While this sounds (and is) cynical, I believe most of it is reasonably accurate to be suitable sources for potential answers.

mfinni : Small clarification : I can't speak to PRINCE2 or COBIT, but ITIL doesn't specify or even suggest anything about org charts. It names a lot of roles that have to be filled, but of course a given person can have more than one role - your change manager can also be the guy in charge of config mgmt, or problem mgmt. It's also completely technology-agnostic, so it definitely doesn't say anything about how a corp's Networks team interacts with the Storage team, for example.

mctylr : @mfinni, thank you, I've been avoiding ITIL at work, so I wasn't sure if it had a generic set of titles or roles that organizations adopt.

From mctylr
Support and Code monkeys at the bottom

Failed code monkeys higher up in the food chain... analysts, project managers, line managers etc

I'm quite serious: developers are mainly happy being developers and don't see BA/PM/LM as a career progression...

I'm in the IT division of a large global company.

mctylr : This is begging for links to Peter Principle and Dunning-Kruger effect. http://en.wikipedia.org/wiki/Peter_Principle and http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

From gbn
I started drawing a chart to show our IT department but as there is only one of me it just ended up looking like a circle.

3dinfluence : That's what my department looks like too.

From John Gardeniers

Implications and benefits of removing NT AUTHORITY\SYSTEM from sysadmin role?

Disclaimer: I am not a DBA. I am a database developer.

A DBA just sent a report to our data stewards and is planning to remove the NT AUTHORITY\SYSTEM account from the sysadmin role on a bunch of servers. (It probably violates some audit report they received.)

I see a MSKB article that says not to do this.

From what I can tell reading a variety of disparate information on the web, a bunch of special services/operations (Volume Copy, Full Text Indexing, MOM, Windows Update) use this account even when the SQL Server and Agent service etc are all running under dedicated accounts.

From serverfault Cade Roux

If you guys have a change management process in place then challenge this. Make sure that they are aware (they really should be) of it and get their confirmation that this won't affect any services.

If you don't have a change management process to challenge this in then I would at least bring it to them. Hopefully there is a friendly relationship between your developers and administrators where you work and you'll be able to learn something from each other. They may know about the risks and might be able to explain to you why they're doing it and show you how they're doing it in a safe way.

Cade Roux : They have plenty of change management process in place; they just don't fill us lowly users in on anything, and I think because of all the process they have, they don't tend to think about things critically. Also, the DBA sent this to people who are basically business/functional administrators for the databases (at least the ones on our server) without clearly thinking it through, since this is a server role and has little to do with data stewardship/guardianship at the database level.

Dynamo : Then I would definitely at least try to approach them with this. I doubt they'd be so cold as to ignore you if you go to them with a legitimate concern and a Microsoft document to back it up. I'd also suggest that it might not hurt to have someone from your group be involved at the change management level. I don't know how things work for your company but our programming manager still attends change management for his team.

Cade Roux : I have sent them the KB and raised the issue. I work in finance. I am basically THE SQL Server consultant/developer. Despite the large profitability system I have converted for them to SQL Server (and now moving to Teradata), they still refuse to treat it as a full-blown system, and I am often raising issues - but all this stuff is outsourced to some person at IBM India, so there's really no team nor any understanding of the roles or functions of the different servers. We used to call these kind of DBAs "functional DBAs" - i.e. backup/restore/security - no actual DB understanding

Dynamo : Ah I see. Well it sounds like it's a bit of a tough situation for you then. I would simply keep doing my best to raise issues where you see them. Hopefully they have a good reason for what they're doing.

From Dynamo
Have these people been asked why they want to remove it, and if they understand what the purpose of the System account is in the first place? I would agree with your guess about an audit report being involved here, and I would also guess that the report just listed which accounts have rights to do what, and that the DBAs are blindly following it by removing any accounts they don't recognise.

Basically the System account is used to give the OS itself rights to Do Stuff. It's not a general user account and shouldn't really be treated as such.

If the DBAs are determined to remove it, maybe try suggesting that they do so on a test system first (preferably one that gets some active day to day use), give it a month to see if anything happens, and then make a final decision.

Cade Roux : I've passed on what information I have gathered. At this point, I'm too busy with my own workload to train the DBAs.

From mh
The only legitimate reason for them to do this (and your MS article confirms it), is to try to prevent administrators of the server SQL is running on from having admin access to the databases. The problem they'll have is that any determined administrator of the OS can go back in and add their permission back in because they have full access to the server. It sounds like what they want to do shouldn't negatively affect anything on the DB (unless something is set up to use AD accounts that are local admins and have no specific DB permissions set), but it's definitely something that they should test first like mh suggested.

If you currently have admin access to the DB through being a local admin of the OS, I'd suggest you create a SQL user (or get another AD user account) that has sysadmin access if that is possible. This would ensure you still have access if they make this change.

Cade Roux : I've got no SQL Server admin or OS admin rights. In our particular database, I can create tables, views, UDFs, procs etc, but I can't even run sp_who to see if I'm blocked, so...

From Paul Kroon
You didn't say which version, and that's key. If you're talking SQL Server 2000 and Full text is installed, no, you cannot remove it. The reason for that is if Full Text does not run under the local System account, it could potentially throw an Access Violation and crash. And the account Full Text runs under needs sysadmin access to the SQL Server. So there you go.

As of SQL Server 2005 and higher, it depends on what accounts you have your services configured to run under. Here's the Books On-line page with the information on service accounts. Generally speaking, local or domain accounts are preferred over any of the built-in accounts for the major services, thus alleviating this concern.

Setting Up Windows Service Accounts (SQL Server 2005 Books Online)

One other thing you didn't mention is whether BUILTIN\Administrators have been removed from the sysadmin fixed server role or not. If not, then System still has sysadmin access as it is considered a member of that local security group.
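The point about group membership can be made concrete with a small Python model. This is not how SQL Server resolves logins internally; it is only a sketch of the reasoning, with hypothetical names (`effective_sysadmins`, `groups`, `role`) and a made-up role membership list.

```python
def effective_sysadmins(role_members, group_members):
    """Expand any groups in the role to the principals that effectively hold it."""
    effective = set()
    pending = list(role_members)
    while pending:
        principal = pending.pop()
        if principal in effective:
            continue
        effective.add(principal)
        # If the principal is a group, its members inherit the role too.
        pending.extend(group_members.get(principal, ()))
    return effective


# Assumption taken from the answer: SYSTEM counts as a local administrator.
groups = {r"BUILTIN\Administrators": [r"NT AUTHORITY\SYSTEM"]}
# Illustrative role membership after SYSTEM itself has been removed.
role = [r"BUILTIN\Administrators", "sa"]

print(r"NT AUTHORITY\SYSTEM" in effective_sysadmins(role, groups))
```

In this model, removing the SYSTEM login from the role changes nothing while the Administrators group is still a member.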

Cade Roux : I would think BUILTIN\Administrators has already been removed. So windows admins cannot just login as administrators. So this remaining "loophole" would mean that a windows admin could still execute something under the local system account and do things. Of course, they can bring the DB up in single user mode and get access...

Cade Roux : This is SQL Server 2005 and neither I nor my boss who is one of the data stewards (approve database role access to databases) being asked to sign off on this change are either windows admins or database admins - and this is a server-wide role, so the individual data stewards can't really have a say. I've passed the info up the line.

K. Brian Kelley : Or they can add themselves to whatever group you're using for DBA access and unless you're auditing AD you won't catch 'em. Or, if it's not Windows Server 2008, they can use a DLL injection attack against LSA Secrets and dump the service account password in plaintext. Or, since you're using 2005, they can simply stop the SQL Server service and copy off the database files. So this isn't much of a defense but it tends to keep the IT auditors happy.

From K. Brian Kelley
