• RELEVANCY SCORE 4.81

    DB:4.81:Ha Agent On Esxi Host In Cluster Nam In Gdc Virtual Data Center Has An Error: Cannot Complete The Ha Configuration Error 4/9/2013 7:18:47 Am Reconnect Host kp





    The VMware cluster operational status shows the following:

    ESXi host : Secondary Role : HA Issue - HA agent on ESXi host in VMware cluster in the Virtual Data center has an error: Cannot Complete the HA configuration.

    Any help is highly appreciated!

    DB:4.81:Ha Agent On Esxi Host In Cluster Nam In Gdc Virtual Data Center Has An Error: Cannot Complete The Ha Configuration Error 4/9/2013 7:18:47 Am Reconnect Host kp


    Is this the first time you are trying to configure HA on this host, or has it worked before?

    I would suggest checking the configuration from scratch, following the VMware article.

  • RELEVANCY SCORE 3.82

    DB:3.82:Esxi 4.1 Ha Configure Error!!! 9s





    I have two ESXi 4.1 hosts. When I configure HA, the error below appears:

    HA agent on xx.xx.xx.xx in cluster xx in xx has an error:

    cmd addnode failed for primary node:error creating ramdisk for HA agent configuration,:unknown HA error

    The other two ESX 4.1 hosts in this cluster are fine.

    DB:3.82:Esxi 4.1 Ha Configure Error!!! 9s


    Thanks for the solution. I was thinking of ScratchConfig, but of course the fix was simply to add more memory to the host :-)

  • RELEVANCY SCORE 3.69

    DB:3.69:Ha Configuration Errors After Upgrading Vcenter From Vsphere 5.0 To Update 3 m7





    We have HA configuration issues with our ESXi clusters after upgrading vCenter Server from vSphere 5.0 to Update 3.

    Error: vCenter Server is unable to find master vSphere HA agent in cluster.

    We see this error for the cluster and on each of the ESXi hosts where we have installed a CPU patch that is not yet certified by VMware.

    I tried uninstalling the microcode VIB and restarting the ESXi host, after which the error disappeared for that host.

    I need help resolving the issue without uninstalling the VIB or rebooting the hosts.

    -shyam

    DB:3.69:Ha Configuration Errors After Upgrading Vcenter From Vsphere 5.0 To Update 3 m7


    Could you move this discussion from "ESXi 5" to "Availability [HAFT]" so that it reaches a wider audience who can answer?

    ~dGeorgey

  • RELEVANCY SCORE 3.51

    DB:3.51:Ha Questions dx



    In an ESXi 5.5 cluster with four ESXi 5.5 hosts, I set "Host failures the cluster tolerates" to 1.

    What happens if two ESXi hosts go down?

    DB:3.51:Ha Questions dx


    You have to calculate the slot size before you specify "Host failures the cluster tolerates".

    What is a slot?
    A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.
    In other words, a slot size is the worst-case CPU and memory reservation scenario in a cluster. This directly leads to the first gotcha:
    HA uses the highest CPU reservation of any given VM and the highest memory reservation of any given VM. With vSphere 4.1, if no reservations higher than 256 MHz are set, HA will use a default of 256 MHz for CPU and a default of 0 MB + memory overhead for memory. With vSphere 5.0 the default for CPU has been brought down to 32 MHz.
    If VM1 has 2 GHz and 1024 MB reserved and VM2 has 1 GHz and 2048 MB reserved, the slot size for memory will be 2048 MB + memory overhead and the slot size for CPU will be 2 GHz.
    Basic design principle: Be really careful with reservations; if there's no need to have them on a per-VM basis, don't configure them.
    By the way, did you know that with vSphere 5.1 and the Web Client you can specify fixed slot sizes in the UI? Nice, right? Keep that in mind when you see some of the advanced settings in the next section; depending on the version you are running, you could potentially just configure it in the UI.
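    The slot size rule described above can be sketched as a small calculation. This is only an illustration, not VMware's implementation: the `VM` class, the `slot_size` function and the `mem_overhead_mb` field are hypothetical names, and the per-VM memory overhead is left at 0 because the post does not give a value for VM1 and VM2.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int
    mem_reservation_mb: int
    mem_overhead_mb: int = 0  # assumption: per-VM overhead, 0 when unknown


def slot_size(vms, default_cpu_mhz=32):
    """Return (cpu_slot_mhz, mem_slot_mb) for a list of powered-on VMs."""
    # CPU slot: highest CPU reservation, never below the version default
    # (32 MHz for vSphere 5.0, 256 MHz for vSphere 4.1 according to the post).
    cpu_slot = max([default_cpu_mhz] + [vm.cpu_reservation_mhz for vm in vms])
    # Memory slot: highest (reservation + overhead); 0 for an empty list.
    mem_slot = max(
        (vm.mem_reservation_mb + vm.mem_overhead_mb for vm in vms), default=0
    )
    return cpu_slot, mem_slot


# The VM1/VM2 example from the post, with 2 GHz written as 2000 MHz.
vms = [VM("VM1", 2000, 1024), VM("VM2", 1000, 2048)]
print(slot_size(vms))  # (2000, 2048), before any memory overhead is added
```

    A single VM with a large reservation raises the slot size for the whole cluster, which is why the post advises being careful with per-VM reservations.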

    How does HA calculate how many slots are available per host?
    Of course we need to know what the slot size for memory and CPU is first. Then we divide the total available CPU resources of a host by the CPU slot size and the total available memory resources of a host by the memory slot size. This leaves us with a number of slots for both memory and CPU. The most restrictive number is the number of slots for this host. If you have 25 CPU slots but only 5 memory slots, the number of available slots for this host will be 5.
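    The per-host division can be written out as follows. This is a minimal sketch: the function name and the host capacity figures are made up for illustration and chosen so that they reproduce the 25 CPU slots versus 5 memory slots case mentioned above.

```python
def slots_per_host(cpu_mhz, mem_mb, cpu_slot_mhz, mem_slot_mb):
    """Return the number of slots a host provides (the most restrictive of CPU and memory)."""
    if cpu_slot_mhz <= 0 or mem_slot_mb <= 0:
        raise ValueError("slot sizes must be positive")
    cpu_slots = cpu_mhz // cpu_slot_mhz  # whole CPU slots that fit
    mem_slots = mem_mb // mem_slot_mb    # whole memory slots that fit
    return min(cpu_slots, mem_slots)


# Hypothetical host: 50,000 MHz and 10,240 MB available,
# with a 2000 MHz CPU slot and a 2048 MB memory slot.
print(slots_per_host(50_000, 10_240, 2000, 2048))  # 25 CPU slots, 5 memory slots -> 5
```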
    This can lead to very conservative consolidation ratios. With vSphere this is something that's configurable. If you have just one VM with a really high reservation you can set the following advanced settings to lower the slot size being used during these calculations: das.slotCpuInMHz or das.slotMemInMB. The advanced settings das.slotCpuInMHz and das.slotMemInMB allow you to specify an upper boundary for your slot size. When one of your VMs has an 8 GB reservation, this setting can be used to define, for instance, an upper boundary of 1 GB to avoid resource wastage and an overly conservative slot size. However, when for instance das.slotMemInMB is configured to 2048 MB and the highest reservation is 500 MB, the slot size for memory will be 500 MB + memory overhead. If a lower boundary needs to be specified, the advanced setting das.vmMemoryMinMB or das.vmCpuMinMHz can be used. So that VMs with high reservations can still be powered on, these VMs will take up multiple slots. Keep in mind that pre-vSphere 4.1, when you were low on resources, this could mean that you were not able to power on such a high-reservation VM, as resources would be fragmented throughout the cluster instead of located on a single host.
    As of vSphere 4.1 HA is closely integrated with DRS. When a failover occurs HA will first check if there are resources available on that host for the failover. If resources are not available, HA will ask DRS to accommodate for these where possible. HA, as of 4.1, will be able to request a defragmentation of resources to accommodate this VM's resource requirements. How cool is that?! One thing to note though is that HA will request it, but a guarantee still cannot be given, so you should be cautious when it comes to resource fragmentation.
    The following is an example of where resource fragmentation could lead to issues:
    If you need to use a high reservation for either CPU or Memory these options (das.slotCpuInMHz or das.slotMemInMB) could definitely be useful, there is however something that you need to know. Check this diagram and see if you spot the problem, the das.slotMemInMB has been set to 1024MB.

    Notice that the memory slot size has been set to 1024 MB. VM24 has a 4 GB reservation set. Because of this, VM24 spans 4 slots. As you might have noticed, none of the hosts has 4 slots left. Although in total there are enough slots available, they are fragmented and HA might not be able to actually boot VM24. Keep in mind that admission control does not take fragmentation of slots into account when slot sizes are manually defined with advanced settings. It does count 4 slots for VM24, but it will not verify the number of available slots per host. As explained, as of vSphere 4.1 it will request defragmentation, but as stated it cannot be guaranteed.
    Basic design principle: Avoid using advanced settings to decrease slot size as it might lead to more down time.
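    The fragmentation problem can be illustrated with a short check. The diagram is not reproduced here, so the free-slot counts per host below are hypothetical; only the 1024 MB slot size and the 4 GB reservation of VM24 come from the text. The function names are also illustrative.

```python
import math


def slots_needed(reservation_mb, slot_mb):
    """Number of slots a VM spans when its reservation exceeds the slot size."""
    return math.ceil(reservation_mb / slot_mb)


def placement_check(free_slots_by_host, needed):
    """Return (cluster_total_ok, single_host_ok) for a VM needing `needed` slots."""
    total_ok = sum(free_slots_by_host.values()) >= needed
    single_host_ok = any(free >= needed for free in free_slots_by_host.values())
    return total_ok, single_host_ok


needed = slots_needed(4096, 1024)  # VM24: 4 GB reservation, 1024 MB slots -> 4
free = {"esx01": 2, "esx02": 2, "esx03": 3}  # hypothetical free slots per host
print(needed, placement_check(free, needed))  # 4 (True, False)
```

    The first value corresponds to what admission control counts (enough slots in total); the second is the per-host condition that it does not verify when slot sizes are set manually.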
    Another issue that needs to be discussed is unbalanced clusters. An unbalanced cluster would, for instance, be a cluster with 5 hosts of which one contains substantially more memory than the others. What would happen to the total number of slots in a cluster with the following specs?
    Five hosts, each with 16 GB of memory except for one host (esx05), which has recently been added and has 32 GB of memory.
    One of the VMs in this cluster has 4 CPUs and 4 GB of memory; because there are no reservations set, the memory overhead of 325 MB is being used to calculate the memory slot size. (It's more restrictive than the CPU slot size.)

    This results in 50 slots each for esx01, esx02, esx03 and esx04. However, esx05 will have 100 slots available. Although this sounds great, admission control rules out the host with the most slots, as it takes the worst-case scenario into account. In other words, the end result is a 200-slot cluster. With 5 hosts of 16 GB, (5 x 50) - (1 x 50), the result would have been exactly the same. (Please keep in mind that this is just an example; this also goes for a CPU-unbalanced cluster when CPU is most restrictive!)
    Basic design principle: Balance your clusters when using admission control and be conservative with reservations as it leads to decreased consolidation ratios.
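    The unbalanced-cluster numbers can be reproduced with a few lines of arithmetic. This sketch follows the simplification used in the example (all host memory counts as available, memory is the restrictive resource) and the function name is hypothetical.

```python
def usable_cluster_slots(host_mem_mb, mem_slot_mb, host_failures=1):
    """Total slots left after removing the `host_failures` largest hosts (worst case)."""
    slots = sorted(mem // mem_slot_mb for mem in host_mem_mb.values())
    if host_failures > 0:
        slots = slots[:-host_failures]  # drop the hosts with the most slots
    return sum(slots)


GB = 1024
unbalanced = {"esx01": 16 * GB, "esx02": 16 * GB, "esx03": 16 * GB,
              "esx04": 16 * GB, "esx05": 32 * GB}
balanced = {name: 16 * GB for name in unbalanced}

print(usable_cluster_slots(unbalanced, 325))  # 4 x 50 = 200 (the 100-slot host is ruled out)
print(usable_cluster_slots(balanced, 325))    # (5 x 50) - (1 x 50) = 200
```

    The extra 16 GB in esx05 adds nothing to the slot count that admission control works with, which is the point of the design principle above.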
    Go through the following article for more details. vSphere High Availability (HA) Technical Deepdive - Yellow Bricks

  • RELEVANCY SCORE 3.46

    DB:3.46:Vsphere High Availability "Election" Fails At 99% "Operation Timed Out" At 1 Of The 2 Hosts 8p



    Hello,

    We had a system with one ESXi 5.1 host with local disks.

    We are now adding redundancy with an ESXi 5.5 U2 host and a vCenter 5.5 appliance.

    After installing and adding everything to vCenter, we upgraded the ESXi 5.1 host to ESXi 5.5 U2. The SAN is operating correctly (vMotion is working on a separate NIC).

    Now, if I try to enable High Availability, both servers install the HA agent and start "Election".

    All datastores (4) on the SAN are chosen for the HA heartbeat; the isolation response is the default, "keep powered on".

    One server always completes this process, and the other keeps "electing" until it gets to 100% and fails the election with "operation timed out".

    I have seen this problem on both servers, so I think the elected "master" does not have the problem, only the "slave".

    I have checked these articles and applied them, but none worked:

    VMware KB: Reconfiguring HA (FDM) on a cluster fails with the error: Operation timed out

    - The services were running

    VMware KB: Configuring HA in VMware vCenter Server 5.x fails with the error: Operation Timed out

    - All MTUs were set to 1500

    VMware KB: Configuring VMware High Availability fails with the error: Cannot complete the configuration of the HA ag

    - The default gateway was not the same on both hosts, but I corrected this. There is no routing. The HA setting is "leave powered on". After correcting this and disabling/re-enabling HA, the problem is still the same.

    VMware KB: Verifying and reinstalling the correct version of the VMware vCenter Server agents

    - I executed "Reinstalling the ESX host management agents and HA agents on ESXi" for the HA agent, and I verified that it was uninstalled and reinstalled when re-enabling HA.

    cp /opt/vmware/uninstallers/VMware-fdm-uninstall.sh /tmp
    chmod +x /tmp/VMware-fdm-uninstall.sh
    /tmp/VMware-fdm-uninstall.sh

    I did this for both hosts. This actually fixed the election problem, and I was even able to run an HA test successfully. But when, after this test, I powered down the second server (to test HA in the other direction), HA did not fail over to the first server and everything remained down. After clicking "Reconfigure HA", the election problem appeared again on one of the hosts.

    These are some extractions from the logs:

    -The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:03:00 PM 192.27.224.138

    -vSphere HA agent is healthy info 11/29/2014 10:02:56 PM 192.27.224.138

    -The vSphere HA availability state of this host has changed to Master info 11/29/2014 10:02:56 PM 192.27.224.138

    -The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:01:26 PM 192.27.224.138

    -vSphere HA agent is healthy info 11/29/2014 10:01:22 PM 192.27.224.138

    -The vSphere HA availability state of this host has changed to Master info 11/29/2014 10:01:22 PM 192.27.224.138

    -The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:03:02 PM 192.27.224.139

    -Alarm 'vSphere HA host status' on 192.27.224.139 changed from Green to Red info 11/29/2014 10:02:58 PM 192.27.224.139

    -vSphere HA agent for this host has an error: vSphere HA agent cannot be correctly installed or configured warning 11/29/2014 10:02:58 PM 192.27.224.139

    -The vSphere HA availability state of this host has changed to Initialization Error info 11/29/2014 10:02:58 PM 192.27.224.139

    -The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:00:52 PM 192.27.224.139

    -Datastore DSMD3400DG2VD2 is selected for storage heartbeating monitored by the vSphere HA agent on this host info 11/29/2014 10:00:49 PM 192.27.224.139

    -Datastore DSMD3400DG2VD1 is selected for storage heartbeating monitored by the vSphere HA agent on this host info 11/29/2014 10:00:49 PM 192.27.224.139

    -Firewall configuration has changed. Operation 'enable' for rule set fdm succeeded. info 11/29/2014 10:00:45 PM 192.27.224.139

    -The vSphere HA availability state of this host has changed to Uninitialized info 11/29/2014 10:00:40 PM Reconfigure vSphere HA host 192.27.224.139 root

    -vSphere HA agent on this host is disabled info 11/29/2014 10:00:40 PM Reconfigure vSphere HA host 192.27.224.139 root

    -Reconfigure vSphere HA host 192.27.224.139 Operation timed out. root HOSTSERVER01 11/29/2014 10:00:31 PM 11/29/2014 10:00:31 PM 11/29/2014 10:02:51 PM

    -Configuring vSphere HA 192.27.224.139 Operation timed out. System HOSTSERVER01 11/29/2014 9:56:42 PM 11/29/2014 9:56:42 PM 11/29/2014 9:58:55 PM
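    The extract above is listed newest-first and mixes both hosts, which makes the sequence hard to follow. A small script can reorder such lines per host. This is a sketch that assumes the line layout seen in this extract (a `MM/DD/YYYY HH:MM:SS AM/PM` timestamp and an IPv4 host address somewhere on each line); the function names are illustrative and the parsing is not based on any documented vCenter export format.

```python
import re
from datetime import datetime

TS = re.compile(r"\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M")
HOST = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_event(line):
    """Return (timestamp, host, text before the timestamp) or None if unparsable."""
    ts, host = TS.search(line), HOST.search(line)
    if not ts or not host:
        return None
    when = datetime.strptime(ts.group(), "%m/%d/%Y %I:%M:%S %p")
    message = line[:ts.start()].lstrip("-").strip()  # includes the severity word
    return when, host.group(), message


def timeline(lines, host):
    """Events for one host, oldest first."""
    events = [e for e in map(parse_event, lines) if e and e[1] == host]
    return sorted(events)


lines = [
    "-The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:03:02 PM 192.27.224.139",
    "-The vSphere HA availability state of this host has changed to Initialization Error info 11/29/2014 10:02:58 PM 192.27.224.139",
    "-The vSphere HA availability state of this host has changed to Election info 11/29/2014 10:00:52 PM 192.27.224.139",
    "-vSphere HA agent is healthy info 11/29/2014 10:02:56 PM 192.27.224.138",
]
for when, host, message in timeline(lines, "192.27.224.139"):
    print(when.isoformat(), message)
```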

    Can someone please provide me with some help here?

    Or extra things I can check or provide?

    I am running out of options currently.

    Best Regards,

    Joris

    P.S. I had problems with Cold Migration when implementing the SAN. After setting up everything (vMotion, upgrading ESX), these problems were gone.

    When searching for this error, I came to this article: VMware KB: VMware vCenter Server displays the error: Failed to connect to host

    And that cause could make sense, since the vCenter server changed and IP addressing was changed during implementation.

    However, in the vpxa.cfg files, the hostip and serverip are correct (checked using https://hostip/host).

    Tried this again today, no problem at all.

    P.P.S. I have configured more of these systems from scratch in the past with no problem (though this is an 'upgrade').

    DB:3.46:Vsphere High Availability "Election" Fails At 99% "Operation Timed Out" At 1 Of The 2 Hosts 8p


    OK so the issue is fixed.

    I contacted Dell Pro Support (OEM delivering the license) and they checked the logs (fdm.log) and found out that the IP default-gateway was not reachable.

    The default gateway is the default host isolation ip address, used by HA.

    Because this is an isolated production system, the supplied gateway turned out to be only for future purposes.

    I now changed the default-gateway to a management address on the switch connected to both hosts, that is pingable.

    This solved everything.

  • RELEVANCY SCORE 3.36

    DB:3.36:Ha Configuration Issues 8k



    Probably a simple and silly question, but...

    I'm having an issue with HA on one of my clusters. At the cluster level I have a message stating "Insufficient resources to satisfy HA failover level on cluster cluster name in datacenter name. Unable to contact a primary HA agent in cluster cluster name in datacenter name".

    Each of the 6 hosts in the cluster shows the error "HA agent on host in cluster cluster name in datacenter name has an error: Error while running health check script."

    Now, looking at the HA configuration at the cluster level, I see someone (too many hands in the kitchen, I guess) enabled DPM under Admission Control. I've disabled that so it matches my other clusters, which happen to be working just fine. I selected "Reconfigure HA" on all the hosts but the process is failing: "Cannot complete the configuration of the HA agent on the host. Unable to contact primary HA agent."

    So I'm thinking that if I just turn off HA on the cluster and turn it back on, that might straighten out the issue.

    2 questions:

    1. Will that fix my issue?

    2. If #1 will, does turning off HA affect the running VMs on the 6 hosts other than putting them at risk temporarily?

    Thanks,

    Casey

  • RELEVANCY SCORE 3.34

    DB:3.34:Moid Of Esxi Host 93



    Hi All,

    I have a question regarding the MOID of an ESXi host. I know it is assigned by vCenter Server to ESXi hosts and that, when every host has the same number of datastores, it is used to determine the master host in an HA cluster.

    My question is:

    When is it assigned to a host? At the time the host is added to the HA cluster, or at the time it is added to the vCenter Server instance? Please clarify.

    DB:3.34:Moid Of Esxi Host 93


    Get-VMHost | Select-Object Name,ID

    http://communities.vmware.com/thread/333155

  • RELEVANCY SCORE 3.33

    DB:3.33:Relocate Virtual Machine x7



    Hi!

    Question -

    I have several ESXi hosts in a DRS/HA cluster. If I reboot an ESXi host, the VMs on it shut down. But VMs that were not powered on are relocated to another ESXi host.

    What needs to be done to stop virtual machines from being relocated when an ESXi host reboots?

    DB:3.33:Relocate Virtual Machine x7


    It works, but is it a good/correct solution? Why doesn't DRS automation level 'Manual' work here? Affinity rules require dedicated groups for each server and rules per ESXi host and VM, whereas DRS in 'Manual' mode is described as letting the user decide which VM moves to which ESXi host. Am I correct?

    Update:

    Also, why does DRS automation level 'Manual' fail only for powered-off VMs? It works correctly for VMs that were running when the ESXi host began to reboot.

    Message was edited by: Ilya

  • RELEVANCY SCORE 3.33

    DB:3.33:How To Add The Esxi Host To Active Directory After Installing zz



    I have installed two ESXi hosts on a workstation. I now want to add them to Active Directory in order to enable HA on the cluster. Can anyone tell me how to add them to AD?

    DB:3.33:How To Add The Esxi Host To Active Directory After Installing zz


    check this out: http://technodrone.blogspot.com/2010/07/esxi-41-

  • RELEVANCY SCORE 3.29

    DB:3.29:Vmodl.Fauly.Hostcommunication Error ps



    I am having problems in our vCenter. One of the hosts just disconnected, and when we try to reconnect it, it goes up to 89% and then fails with the error "Internal error: vmodl.fault.HostCommunication". We don't use DNS, but ping from vCenter to the host works fine, and ping from the host to vCenter works fine. Telnet from either side on port 902 also works. The vpxa agent service is running on the ESXi host. The ESXi host is version 5.0 and is part of an HA cluster. We can connect to the host directly via the vSphere Client. I have deleted the vpxuser from the host, uninstalled the vCenter agent and then tried connecting to the host again, with the same problem. I have also removed the host from vCenter, and when I try adding it back I get the same error. I have also tried adding the host to another vCenter, with the same problem. Has anyone experienced this problem before?


  • RELEVANCY SCORE 3.28

    DB:3.28:Ha Error. Where Can I Find Log In Esxi Server? jc



    Hi all:

    In my lab, there are two servers in a cluster. One is ESXi 3.5, the other is ESX 3.5.

    The error message shown on the ESXi server is: HA agent has error.

    Where can I find the log file on ESXi, and how?

    Thanks a lot.

    DB:3.28:Ha Error. Where Can I Find Log In Esxi Server? jc

    Exactly. Try F2 first; then, if you need more details, try the RCLI interface.

  • RELEVANCY SCORE 3.28

    DB:3.28:How To Verify If Ha Agent Is Properly Enabled On A Particular Host In Ha Cluster dj


    Hello list,

    I have used the VI Toolkit and found it pretty useful for automation. Thanks to the list.

    I have a query regarding how to verify that the HA agent is enabled on a particular host through VI Toolkit cmdlets.

    I have tried the following.

    Get-Cluster | Get-View | %{
        foreach($h in $_.Host){
            if((Get-View $h).Name -eq $esxtgt){
                Write-Host $esxtgt
                Write-Host "`tDRS enabled : " $_.Configuration.DRSConfig.Enabled
                Write-Host "`tHA enabled : " $_.Configuration.DasConfig.Enabled
            }
        }
    }

    But this only tells you whether the cluster is enabled for HA / DRS. What I actually need is a way to see the status of the HA agent on a particular host in the HA cluster. Your suggestions will be greatly appreciated.

    thanks,

    Krishnaprasad

  • RELEVANCY SCORE 3.27

    DB:3.27:Ha Errors After Ha Has Been Disabled ff


    I have recently disabled HA on a Lab Manager ESX cluster. However, we are getting the yellow Configuration Issues box with a message about "HA agent on esx-8.xxx.xx.xx in cluster Lab Manager in Primary has an error".

    Is there actually something wrong?

    The HA agent is not running on the host

    DB:3.27:Ha Errors After Ha Has Been Disabled ff


    I've seen this happen a couple of times. When I restarted the mgmt-vmware agent on the specific host the issue was solved. Also enabling HA and disabling it again has solved the warning.

    Duncan

    VMware Communities User Moderator | VCP | VCDX

    -

    Blogging: http://www.yellow-bricks.com

    Twitter: http://www.twitter.com/depping


  • RELEVANCY SCORE 3.25

    DB:3.25:Ha Agent Has An Error Incompatible Ha Networks 38



    Hi,

    I'm installing a new VMware environment, and when adding a host to the cluster I received this error message:

    HA agent has an error Incompatible HA Networks: Host has network(s) that don't exist on cluster members: 10.10.2.220: cluster has networks missing on host 10.10.1.220

    Consider using advanced cluster settings das.allowNetwork to control network usage.

    Find attached my network configuration.

    Thanks in advance,

    ER

  • RELEVANCY SCORE 3.25

    DB:3.25:Problem With Vsphere Ha (Vsphere Ha Agent Status Election Or Unreachable) p3



    We have an 8-node cluster (ESXi 5.5).

    After the switch was rebooted, vMotion stopped working.

    The HA agent status has also changed to Election or Unreachable (except on one node, where the HA agent status is Master).

    We tried disabling and re-enabling HA on the cluster. We also tried removing a node from the cluster and adding it again.

    What could be the problem?

    DB:3.25:Problem With Vsphere Ha (Vsphere Ha Agent Status Election Or Unreachable) p3


    Have you checked /var/log/vmkernel.log and fdm.log? They would probably tell you whether there are network or configuration issues. It sounds like the hosts cannot see each other on the management interface.

  • RELEVANCY SCORE 3.23

    DB:3.23:A Cluster W Esx And Esxi Servers 9x



    Is it possible to have an HA cluster with both ESX and ESXi servers? (I assume so.)

    When I add an ESXi server to my ESX server cluster, it reports "Misconfiguration in Host network" and cannot configure HA.

    What host network is it referring to, and what should be done?

    thx.

    DB:3.23:A Cluster W Esx And Esxi Servers 9x

    You have to add das.allownetwork parameters for each type of ESX server. http://kb.vmware.com/kb/1006541

    If you're using only one service console on your ESX host(s) and you kept the names default...add 2 entries like:

    das.allownetwork0 = "Service Console"

    das.allownetwork1 = "Management Console"

    Which is what the bottom illustration in the above KB shows.

  • RELEVANCY SCORE 3.23

    DB:3.23:Ha Reconfigure Issue 9s



    Dear All,

    I am facing an issue with HA. I created a cluster with 4 ESXi hosts. After 2 days I noticed a red mark on two of the ESXi hosts. I reviewed the tasks and events of those hosts; the alarm was related to HA. I tried to reconfigure HA on both ESXi hosts, but it did not work. Now I am getting the error below.

    Reconfigure HA host 122.62.242.154 Cannot complete the configuration of the HA agent on the host. Other HA configuration error.

  • RELEVANCY SCORE 3.20

    DB:3.20:Cannot Complete The Configuration Of The Ha Agent On The Host Misconfiguration In The Host Network 3x



    I have 6 hosts in one cluster. One of the six is giving this error (it was added to the existing cluster today).

    I don't have a distributed switch in the cluster.

    I tried "Reconfigure for VMware HA", but no luck.

    I tried disabling and re-enabling "Turn On VMware HA" on the cluster, but to no avail.

    What could be the problem?

    DB:3.20:Cannot Complete The Configuration Of The Ha Agent On The Host Misconfiguration In The Host Network 3x


    A misconfiguration of networks used for the service console might be the issue... Here's a KB you might wanna check out:

    http://kb.vmware.com/kb/1019200

    /Rubeck

  • RELEVANCY SCORE 3.19

    DB:3.19:Cluster (Ha) Status Shows Protected Vm's ? sj



    Hi all,

    I am investigating why VMs did not fail over to another host in an HA cluster after a Purple Screen of Death occurred on the failing host.

    What I see is that the HA agent is not running correctly on the working host, which could explain why HA couldn't do its job.

    When I go to the cluster and open the Cluster Status, I see 0 hosts connected to the master. On the second tab I see that 7 VMs are not protected and 10 are protected.

    My question is: how is it possible that 10 VMs are protected if the other host in the cluster doesn't have a functional HA agent? The 10 VMs I am talking about are running on the host with a working HA agent. But you need at least one other host that can connect to the master, right?


  • RELEVANCY SCORE 3.19

    DB:3.19:Can Not Complete The Configuration Of Ha Agent On The Host. Other Ha Configuration Error m1



    Dear Team,

    I am facing an issue while configuring HA on one server. The scenario is as follows:

    Six months ago I installed the VMware solution for our organisation. It is ESXi 4.1. At that time there were two servers in the cluster, and both were configured with HA successfully.

    Now I have added one more server to the cluster, but when I try to configure HA on it, it gives the error "Can not Complete the configuration of HA agent on the host. Other HA configuration error." (A screenshot is attached.) I have right-clicked the server and selected Reconfigure VMware HA many times, but it did not succeed.

    Could anyone please help me? It is a live system and a very critical node, so I need help urgently.

    Additionally, we have also purchased 1 year of Technical Support (TS) for our product. Could anyone please guide me on how to raise a ticket for this?

    Thanks and regards,

    Gaurav Asthana


    DB:3.19:Can Not Complete The Configuration Of Ha Agent On The Host. Other Ha Configuration Error m1


    One thing I also want to add is that I get this error after the task reaches 97%. Please give me a solution that does not require removing this server from the cluster and does not affect the service.

  • RELEVANCY SCORE 3.19

    DB:3.19:How Does Ha Cluster Host Behave When One Host Fails That Holds A Mscs Cib (Ms Cluster On One Host) 9m



    Hello everyone,

    I have a fairly simple question and hope to get the answer here in the VMware community before I start testing to get a concrete answer.

    My situation:

    I am running a two-host vSphere 5.1 U2 ESXi HA cluster in my vCenter datacenter, where more HA clusters reside.

    I have set up this two-host ESXi cluster specifically for Microsoft failover clustering VMs.

    Some of my MSCS clusters run as CAB (with RDM) and one as CIB (with VMDK). I have also defined DRS affinity and anti-affinity rules.

    Now my questions:

    In the MSCS Cluster in a Box (CIB) scenario, I have two VMs on one ESXi host.

    What happens if the ESXi host that is hosting the CIB VMs fails? (Remember, I have a second ESXi host!)

    If ESXi Host1 fails, the MSCS CIB VMs will also fail; this is clear. But will the CIB VMs start automatically on the remaining ESXi Host2 or not?

    I hope someone can easily answer this question for me. I can only assume that they will, and would be happy if anyone can confirm.

    THX in advance

    nicole

  • RELEVANCY SCORE 3.18

    DB:3.18:Errors In Configuration Ha Agent On The Host 9z


    Hello!

    I have a problem with VMware HA.

    I have 2 hosts in the cluster (ESX 4.0.0) and one vCenter Server 4.0.

    I enabled HA in the cluster, but my hosts can't configure the HA agent.

    It gives this error: Cannot complete the configuration of the HA agent on the host. See the task details for additional information. Other HA configuration error.

    In the task details for this task: cmd addnode failed for secondary node: Internal AAM Error - agent could not start.: Unknown HA error

    How can I fix this problem?

    Thank you!

    DB:3.18:Errors In Configuration Ha Agent On The Host 9z

    Thank you!

    My problem was fixed by correcting the DNS configuration for the vCenter DNS name, as you said.

    Thanks for your help!

  • RELEVANCY SCORE 3.17

    DB:3.17:Esxi 5.1 Host Ha Status Is Showing As Election 3f



    Hi,

    I have a 4-node ESXi 5.1 HA and DRS cluster. I recently replaced the SSL certificates of all these hosts, and I disabled HA before doing so. Now, after replacing the hosts' SSL certificates, when I enable HA on the cluster only one ESXi host shows the HA status "Running (Master)" and the other three hosts show the HA status "Election".

    I tried reinstalling the HA agent on the hosts, but the issue persists. I also tried reconfiguring HA on the ESXi hosts and disabling and enabling HA on the cluster, but no luck.

    Can someone help me get this resolved?

    Thanks,

    KC

    DB:3.17:Esxi 5.1 Host Ha Status Is Showing As Election 3f


    Hi All,

    Thanks for all your responses.

    The issue is resolved without using the HostReconnect.pl script. I followed the steps below, and now all the ESXi hosts are showing normal HA status.

    1) Put the host in Maintenance Mode. Disconnect and remove it from vCenter Server.

    2) Restart the Management Agent through the DCUI.

    3) Uninstall the HA agent using the command line.

    4) Add the host to the cluster again.

    5) Exit Maintenance Mode.

    By following these steps, I successfully configured HA on all four ESXi hosts. Now one host shows as Master and the other three hosts show as Slave.

    Regards,

    KC

  • RELEVANCY SCORE 3.17

    DB:3.17:Cannot Complete The Configuration Of The Ha Agent On The Host. x3



    Hello,

    I upgraded vCenter from 4.0 to 4.1, and the upgrade completed successfully. But I get the error below in my cluster.

    ESX version - 4.0.0 build 208167

    Cannot complete the configuration of the HA agent on the host. Misconfiguration in the host network setup system.

    I tried disabling HA and reconfiguring it, but I still get the same error.

    I checked the HA agent version on ESX and it seems fine. I am also able to reach every ESX service console from the other ESX servers.

    Thanks!!

    DB:3.17:Cannot Complete The Configuration Of The Ha Agent On The Host. x3


    Exactly the same problem I had.

    Suddenly I wasn't able to enable HA in one ESXi host.

    I don't know why, but I had 2 VMkernel Ports enabled for Management. I just disabled the Management traffic in one of the ports, and the problem was solved.

    Thank you blahphish.

    Omar

  • RELEVANCY SCORE 3.16

    DB:3.16:Ha Error mj



    I have a 3-node cluster.

    1 host is ESXi 4.1 U1

    the 2 others are ESX 4.0

    HA gets an error when I enable it on both ESX 4.0 hosts.

    HA agent in cluster has an error: cannot complete HA configuration

    Tasks and events:

    /opt/vmware/aam/bin/ft_startup failed to complete within 3 minutes: Unknown HA error

    DB:3.16:Ha Error mj


    I am not sure I have found the issue yet, but I found out that my vSphere version was older than the ESXi host that I was trying to put into the cluster. I have since upgraded to vSphere U1 345043, but even after the upgrade I was still having HA problems. It was hit and miss: I would disconnect a host, remove it, and put it back in the cluster; sometimes it would work, sometimes it wouldn't. I tried disabling HA at the cluster level, and most of the time it failed. vMotion also timed out.

    I also noticed that every time I remove a host and put it back into a cluster, it loses all vDS configuration. I had to manually add the host back into the dvSwitch.

    Since upgrading all hosts to ESXi 4.1 U1, it has been working OK.

  • RELEVANCY SCORE 3.16

    DB:3.16:Ha Agent Disabled On Diskless Esxi af



    I have an HA cluster of 3 ESXi hosts - 2 diskless with embedded ESXi, and one installable.

    Some time ago one of the diskless hosts reported that the HA agent is disabled, and since then I cannot re-enable the agent.

    Build 123629

    -- ERROR task-internal-1288176 -- -- DasConfig.ConfigureHost: vim.fault.AgentInstallFailed:

    (vim.fault.AgentInstallFailed) {

    dynamicType = ,

    msg = ""

    }

    -- FINISH task-4870 -- host-1537 -- vim.HostSystem.reconfigureDAS -- 629A554D-46A9-4908-BA57-09F3E3413ECD

    -- ERROR task-4870 -- host-1537 -- vim.HostSystem.reconfigureDAS: vim.fault.AgentInstallFailed:

    (vim.fault.AgentInstallFailed) {

    dynamicType =

    ---

    http://blog.vadmin.ru

    DB:3.16:Ha Agent Disabled On Diskless Esxi af


    Problem solved: this is a hardware problem with HP USB keys shipped between March 31, 2008 and September 15, 2008.

    ---

    http://blog.vadmin.ru

  • RELEVANCY SCORE 3.15

    DB:3.15:Error: "The Vsphere Ha Agent On This Host Cannot Reach Some Of The Management Network Addresses Of Other Hosts, And Ha May Not Be Able To Restart Vms If A Host Failure Occurs..." kk



    I just want to verify that this error is the result of normal behavior when a host goes down in a cluster. We are migrating to vCenter 5.1 (from version 4.1) and are testing three ESXi 4.1 servers in a cluster in vCenter 5.1. Part of this was to test HA.

    I connected to one of the ESXi hosts via iLO and restarted the host directly from the console. When the server went down for a reboot, the VMs on that server failed over to the other ESXi hosts in the cluster as expected. A few minutes later, while that host was still down and in the process of rebooting, we received an alert on the first node in the cluster (which shows an HA state of "Connected (Slave)") stating:

    "The vSphere HA agent on this host cannot reach some of the management network addresses of other hosts, and HA may not be able to restart VMs if a host failure occurs: {servername / IP}" (the server name and IP being that of the server that we rebooted from the ESXi console)

    The second host in our cluster shows the HA status as "Connected (Master)" and this alert does not show for this server.

    Once the third node is back up from the reboot, the error goes away. Is this message normal when a server in a cluster goes down in vCenter 5.1?

    Thanks

    DB:3.15:Error: "The Vsphere Ha Agent On This Host Cannot Reach Some Of The Management Network Addresses Of Other Hosts, And Ha May Not Be Able To Restart Vms If A Host Failure Occurs..." kk


    Actually, node 1 shows as the HA slave (where I received the error). With 3 hosts online in the cluster, the current failover capacity shows as 2 hosts (we've set HA to allow 1 host). We only have 5 VMs on this cluster, and the hosts have 2 quad-core processors and 128 GB of RAM apiece.

    When I take a host down, the VMs fail over but I receive the message on host 1 (HA slave) that I specified in my first post. With 1 host down current failover capacity changes from 2 to 1 but I still receive the message that "the HA agent on this host cannot reach some of the management network addresses and may not be able to start VMs if a host failure occurs". I just wanted to confirm if this message is normal whenever you lose a host in an HA cluster in vCenter 5.1?

  • RELEVANCY SCORE 3.15

    DB:3.15:Ha Agent On Server.Domain In Cluster Server.Domain Has An Error p3



    I have an ESX 3.5 host that is reporting "HA agent on <server.domain> in cluster <server.domain> has an error" in the Host summary screen. The problem is that this host is not a member of a cluster and never has been. I have no option to configure HA. The host and its VMs appear to be running fine, so this may be only a cosmetic issue, but I would like to clear it up. Thanks for any help.

  • RELEVANCY SCORE 3.15

    DB:3.15:Ha With Esxi Embedded zs



    I have just got an evaluation of VMware Infrastructure working with some HP BL495 blades using ESXi embedded (on internal USB flash drives).

    I have configured a datastore on an NFS partition on a NetApp; the VMs run fine and VMotion works correctly.

    However, when I try to create a cluster I get the following error:

    HA agent has an error: Host in HA Cluster must have userworld swap enabled

    I have tried following the KB article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004177 but am not sure what path I should use with an NFS datastore. All the obvious options fail.

    I have seen a few comments around the web that suggest that it's not possible to use HA with userworld swap on NFS or iSCSI storage. Is this really the case? Is there any workaround if it is?

    DB:3.15:Ha With Esxi Embedded zs

    If you create a unique subdirectory on the NFS datastore for each ESXi host, you can point to it (make sure that the path in the scratch.config location is correct; a typo results in an error, of course).

    I agree that this should be in a VMware KB article - as it's not NetApp behavior - it's true of any NFS server (I did it with an EMC Celerra).
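
    The "one unique subdirectory per host" advice above can be sketched as a small helper that derives a distinct scratch path for each host. This is only an illustration: the datastore name, the hostnames and the `.locker-` prefix are assumptions, not values from the thread. The resulting path is what you would enter for the ScratchConfig.ConfiguredScratchLocation advanced setting mentioned elsewhere in these threads.

```python
from posixpath import join


def scratch_locations(datastore, hostnames, prefix=".locker-"):
    """Map each host to its own scratch directory on a shared datastore.

    `datastore`, `hostnames` and `prefix` are hypothetical example inputs.
    """
    locations = {}
    for name in hostnames:
        # Use the lowercase short hostname so each directory is easy to recognise.
        short = name.split(".")[0].lower()
        locations[name] = join("/vmfs/volumes", datastore, prefix + short)
    # Two hosts must never share a scratch directory.
    if len(set(locations.values())) != len(locations):
        raise ValueError("scratch directories must be unique per host")
    return locations


if __name__ == "__main__":
    hosts = ["esxi01.example.com", "esxi02.example.com"]  # hypothetical names
    for host, path in scratch_locations("nfs_datastore1", hosts).items():
        print(f"{host}: {path}")
```

    The directories still have to be created on the datastore before the setting is changed; the helper only guards against typos and accidental sharing.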

  • RELEVANCY SCORE 3.14

    DB:3.14:"An Error Occured During The Configuration Of The Ha Agent On The Host." da



    Hi,

    I get the following error when attempting to configure HA on one of my ESXi hosts: "An error occurred during the configuration of the HA Agent on the host." Expanding the error, it states "Host in HA Cluster must have the userworld swap enabled."

    I take it that I have to follow the advice in this article http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004177 and create a location on the ESXi host to house the userworld swap, but I am not sure which commands to use in Remote CLI to create an appropriate location.

    Can anyone provide some assistance as to how to resolve this particular issue.

    Thanking you in anticipation,


  • RELEVANCY SCORE 3.14

    DB:3.14:Esxi Ha Agent 37



    Hi,

    This morning I saw a message from one of our ESX hosts reporting a problem with the HA agent. And in the v-Check script from Alan, I saw one host not responding.

    But there were no issues with the VMs running on that host.

    So I guess it was not isolated from the network or from the datastore. I dug a little deeper and analysed the fdm.log, but I could not find the issue. I saw some entries like:

    "[ClusterElection::ChangeState] Startup = Uninitialized : Not connected to host agent"

    "Change state to Uninitialized:1218918708"

    "New cluster state is 0"

    I attached the fdm.log from when the event happened. Could someone please take a look at the log file and give me some hints?

    Thanks

    Frank

    DB:3.14:Esxi Ha Agent 37


    I don't see any issues. Our server monitoring tool doesn't show any loss of connection.

    vmkernel.log show me:

    2012-03-21T16:15:13.337Z cpu9:1316301)Config: 346: "SIOControlFlag2" = 0, Old Value: 0, (Status: 0x0)
    2012-03-21T16:15:19.371Z cpu12:1316301)Config: 346: "VMOverheadGrowthLimit" = -1, Old Value: 0, (Status: 0x0)
    2012-03-21T16:15:19.549Z cpu1:2716)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x1a (0x4124011c9400) to dev "mpx.vmhba0:C0:T0:L0" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.Act:NONE
    2012-03-21T16:15:19.549Z cpu1:2716)ScsiDeviceIO: 2316: Cmd(0x4124011c9400) 0x1a, CmdSN 0x67f00 to dev "mpx.vmhba0:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    2012-03-21T16:15:19.924Z cpu4:1316301)Config: 446: "Vmknic" = "vmk1", Old value: "vmk1,vmk3" (Status: 0x0)
    2012-03-21T16:15:19.956Z cpu4:1316301)Config: 446: "Vmknic" = "vmk1,vmk3", Old value: "vmk1" (Status: 0x0)

    On our switch, the management network ports show no errors and no up/down changes.

    Frank

  • RELEVANCY SCORE 3.14

    DB:3.14:How To Resolve Ha Problem On Esxi Host js



    Hi,

    How can I resolve an HA problem on an ESXi host without removing it from the cluster?

    Thanks,

    Tushar

    DB:3.14:How To Resolve Ha Problem On Esxi Host js


    As mentioned by schepp it's hard to find a solution for an unknown issue

    Anyway, issues with HA can sometimes be resolved by right clicking the host in the inventory and "Reconfigure for HA".

    Andr

  • RELEVANCY SCORE 3.13

    DB:3.13:What If The Esxi Cant Ping The Gateway 17



    Hi. Due to a network issue, the ESXi hosts can't ping the gateway. I think this is why the hosts get disconnected in the cluster after some time and give the following error:

    vsphere ha agent on this host could not reach isolation address gateway

    Can you please confirm whether the above error is due to the gateway issue? For testing purposes I also changed the host isolation address to another ESXi server's IP, then disabled and re-enabled HA, but the servers still get disconnected from vCenter and go into a not responding state. PFA image for host isolation settings.

    DB:3.13:What If The Esxi Cant Ping The Gateway 17


    Well, there seem to be two things going on here:

    1. Connectivity between the VC and ESXi hosts

    2. Isolation address issues

    Let's take this one beast at a time.

    In order to have the hosts "connected" to VC there seriously are just two requirements

    1. You should have port 443 open and that would enable the VC to push the HA/vpx etc agents to the ESX hosts

    2. UDP traffic has to be allowed between the VC and ESX hosts else the hosts would connect and disconnect from the VC after a minute

    http://kb.vmware.com/kb/2040630

    The KB above should help with the hosts connection issue

    Now, if the hosts are getting disconnected, the hosts are likely not able to heartbeat with VC and each other as well - hence this is likely a by-product of the other issue.

    So to get to a resolution, get the hosts connected first - as that would also help in the HA issues you are seeing!
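
    The first requirement above (TCP port 443 open between VC and the hosts) can be spot-checked with a short script. This is a minimal sketch using only the Python standard library; the function names are illustrative, and the hostnames you pass in are your own. It does not test the UDP heartbeat traffic from the second requirement, which a plain TCP connect cannot verify.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unresolvable name, etc.
        return False


def unreachable_hosts(hosts, port=443):
    """List the hosts whose TCP port (443 by default) cannot be reached."""
    return [h for h in hosts if not tcp_port_open(h, port)]
```

    Run from the vCenter machine, `unreachable_hosts(["esxi01.example.com", "esxi02.example.com"])` (hypothetical names) returns the hosts that need a firewall or routing check first.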

  • RELEVANCY SCORE 3.12

    DB:3.12:Upgrade Esx 4.1 To Esxi 4.1 In A Cluster Question 9p



    We have a cluster with 4 hosts. All hosts are running ESX 4.1. This cluster has HA and DRS turned on. I am planning to upgrade those hosts from ESX 4.1 to ESXi 4.1 one by one. Do I have to turn off HA and DRS before the upgrade?

    DB:3.12:Upgrade Esx 4.1 To Esxi 4.1 In A Cluster Question 9p


    Basically you can run ESX and ESXi in a mixed cluster. Just in case you run into an issue with configuring HA, http://kb.vmware.com/kb/1019200 might help.

    Andr

  • RELEVANCY SCORE 3.11

    DB:3.11:Question About Ha Issue p3


    After enabling HA on the cluster, a configuration issue shows "Insufficient resources to satisfy HA failover level on cluster". In addition, one of the hosts shows the configuration issue "HA agent disabled". I tried disabling and re-enabling HA, which did not work. I tried Reconfigure for VMware HA, which did not work either.

    What should I do?

    DB:3.11:Question About Ha Issue p3


    glad to see you got it resolved.

    please consider the use of the "helpful" buttons to award points for answers you found useful

  • RELEVANCY SCORE 3.08

    DB:3.08:After 4.1 Upgrade, 1 Host "Error While Running Health Check Script" - What To Check Next? 7k



    I had an ESXi 4 update 1 host which I used update manager to upgrade to 4.1. Upon rejoining the cluster, I receive a red mark and the host states

    "HA agent on has an error : Error while running health check script"

    Two hosts in a separate cluster do not have the same issue.

    Thus far I've

    Reconfigured HA

    Disabled HA/Enabled HA for the cluster

    Restarted management agents

    Restarted Server

    Put server in another cluster

    ...all result in the same error.

    Any ideas?

    DB:3.08:After 4.1 Upgrade, 1 Host "Error While Running Health Check Script" - What To Check Next? 7k

    AllBlack wrote:

    Did you see any other issues after upgrading to vcenter 4.1 such query service not working or getting time-outs when sorting lists?

    Cheers

    Please consider marking my answer as "helpful" or "correct"

    I am also seeing this error, but so far only at a single client site, and it does not seem to be hurting anything. I want to find the fix before I update the hosts to 4.1.

  • RELEVANCY SCORE 3.07

    DB:3.07:Unable To Enable Ha Agent Or Hot Vmotion On Esxi 3.5 123629 Host In Cluster cd


    I am unable to enable HA agent or hot vMotion any VMs on an ESXi 3.5 123629 host in a HA/DRS/EVC cluster. The funny thing is that HA and vMotion were working fine on all three hosts in the cluster, but recently stopped working on one of them. No changes were made.

    Two of the hosts are HP DL380 G5s and one is a DL360 G5. Two of the servers are running ESXi 3.5 123629 and one is running ESX 3.5.0 123630. All three hosts have the same DNS info. I am able to ping and resolve all three ESX hosts by name.

    I have disabled and re-enabled HA on the cluster, but one of the hosts still fails with the error An error occurred during configuration of the HA agent of the host. - cmd remove failed.

    I tried to vMotion from the host with the HA error and it stops at 10% before timing out. I can cold migrate the VMs from that same host to one of the two other hosts.

    I am hoping that I don't have to power down all of the VMs on the host with the issue and then reboot the host. I am not even sure that will help. Any ideas? THANKS

    DB:3.07:Unable To Enable Ha Agent Or Hot Vmotion On Esxi 3.5 123629 Host In Cluster cd

    That's a normal thing to see. VMkernel IPs are not part of the VM networks and are not enumerated there. Only the VM networks would be part of the list.

    Are there any other vmkernel nets? Like iSCSI etc.

    Do a vmkping from one host to the other for the vmotion network.

    e.g

    server1 vmotion ip = 10.10.10.1

    server2 vmotion ip = 10.10.10.2

    from server1

    vmkping 10.10.10.2

    If it responds then the network is not the issue.
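
    With more than two hosts, the same check has to be repeated in every direction. The sketch below only generates the list of `vmkping` commands to run on each host; it does not execute anything. The host names and IPs are the example values from this reply, and the helper name is made up for illustration.

```python
from itertools import permutations


def vmkping_plan(vmotion_ips):
    """Return (source host, command) pairs covering every ordered host pair."""
    plan = []
    for (src, _src_ip), (_dst, dst_ip) in permutations(vmotion_ips.items(), 2):
        # Run this command on `src` to test its vMotion path to the other host.
        plan.append((src, f"vmkping {dst_ip}"))
    return plan


if __name__ == "__main__":
    example = {"server1": "10.10.10.1", "server2": "10.10.10.2"}
    for host, command in vmkping_plan(example):
        print(f"on {host}: {command}")
```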

    http://blog.laspina.ca/

    vExpert 2009

  • RELEVANCY SCORE 3.07

    DB:3.07:Reinstall Esxi On Host In Existing Cluster. d3



    Hi all,

    I have three hosts running ESXi 5 and a physical vCenter server; the hosts are all in an HA/DRS cluster. I need to reinstall ESXi on each host because of an incorrect RAID configuration. I was wondering what the correct steps are to do this safely, as I have a few guests running.

    I was thinking:

    1) Enter host into maintenance, disconnect

    2) Rebuild host and reconfigure networking options

    3) Add to vCenter again

    4) Repeat for hosts 2 and 3

    Is that it?

    Anything I'm missing? Will anything with the licensing be affected?

    Thanks in advance!

    DB:3.07:Reinstall Esxi On Host In Existing Cluster. d3


    If you have any templates, it'll save some headache if you convert them to VMs first.

    Otherwise, that all looks fine. I'm sure you meant to imply this step when adding it to vCenter, but add it back into the cluster and let it finish the HA configuration to be safe, before you put another in maintenance mode.

    It will let you reassign the license back to that host when you connect it back to vCenter. Nothing to worry about there.

  • RELEVANCY SCORE 3.07

    DB:3.07:Ha Agent Error When Reboot First Node In Cluster 3z



    Hi all,

    I have the following problem:

    With both nodes running in the cluster, I reboot the first node. After the node restarts, I receive the following error:

    HA agent on "node2.dom.es" in cluster "cluster name" in "datacenter" has an error.

    I have seen a lot of posts about the same issue...

    I have configured the hosts file, tested name resolution, reloaded HA, etc.

    But I still have the same problem.

    Please, any idea?

    DB:3.07:Ha Agent Error When Reboot First Node In Cluster 3z


    Did you try a new cluster? Or manual install of the vpxd file?

  • RELEVANCY SCORE 3.06

    DB:3.06:When Does Vcenter Push An Updated Ha Agent? 7s



    Suppose you have a 5.1 U1 cluster and then update vCenter to U2. When does a new vmware-fdm VIB get installed on the ESXi hosts? Do you have to put a U1 host in maintenance mode and then take it back out? Or do you need to turn HA off on the cluster and then back on, so that it puts the newer vmware-fdm on all hosts?

  • RELEVANCY SCORE 3.05

    DB:3.05:Enable To Contact Primary Ha Agent In Cluster Xxx In Xx aj



    When I try to configure DRS and HA on my 3 ESX 3.0.1 servers, I get an error: Unable to contact primary HA agent in cluster Montvale (name of cluster) in NJ (name of datacenter)

    It also says there are insufficient resources to satisfy HA failover level on cluster (Montvale) in (NJ).

    When I disable DRS and HA from the cluster NJ, it gives me an error on ESX host, like this:

    Configuration issues

    HA Agent on X.X.X.X (host IP Address) in cluster Montvale in NJ has an error.

    Please help on this...

    Thanks.

    Message was edited by: ketalparikh

    DB:3.05:Enable To Contact Primary Ha Agent In Cluster Xxx In Xx aj

    What I meant was the issue with starting VMs; you have other problems there.

    Check what the others mentioned; you may have to recreate your cluster.

    Be patient when disabling HA: wait longer, and then try to enable HA again.

  • RELEVANCY SCORE 3.05

    DB:3.05:Need To Move An Esxi 5 Host To A Cluster - Will Vm Stay Up? z7



    I have one ESXi 5.0 host running in my vSphere datacenter that is a standalone host with its resources listed as "Grafted from IP Address". This was a test installation that now has one production VM running on it. If I want to move this host into an existing DRS/HA cluster, can I just choose "Disconnect" on the host, select the cluster to which I want to add it, and select "Add host to cluster" to put it in that cluster, with the production VM remaining up while the host is successfully added to the cluster?

    And no, I will never have a stand alone host again. :-)

    Thanks!

    DB:3.05:Need To Move An Esxi 5 Host To A Cluster - Will Vm Stay Up? z7


    scorchUGA wrote:

    I have one ESXi 5.0 host running in my vSphere datacenter that is a standalone host with its resources listed as "Grafted from IP Address". This was a test installation that now has one production VM running on it. If I want to move this host into an existing DRS/HA cluster, can I just choose "Disconnect" on the host, select the cluster to which I want to add it, and select "Add host to cluster" to put it in that cluster, with the production VM remaining up while the host is successfully added to the cluster?

    And no, I will never have a stand alone host again. :-)

    Thanks!

    That will work fine.

    The VM doesn't go down just because the host isn't connected to a vCenter.

  • RELEVANCY SCORE 3.04

    DB:3.04:Cmd Addnode Failed For Secondary Node: Internal Aam Error - Agent Could Not Start.: Unknown Ha Error 9a



    I'm testing upgrade paths from ESX 3.5 Update 4 to vSphere on an IBM BladeCenter with 2 HS21 XM blade servers. I've run into several problems, and the latest is the one in the subject of this thread.

    In this scenario I upgraded vCenter successfully. I then moved all VMs onto a single ESX 3.5 host, removed the other host from the cluster and then from vCenter, did a fresh install of vSphere on it, and reconnected it to the cluster. Then I repeated the whole procedure with the second node. I now have two hosts with vSphere installed, but I had to disable HA in my cluster because I always get this error when I try to configure the HA agents on the hosts. DRS, however, works fine.

    In the known issues section of the vSphere release notes I can read:

    "Upgrading from an ESX/ESXi 3.x host to an ESX/ESXi 4.0 host results in a successful upgrade, but VMware HA reconfiguration might fail

    When you use vCenter Update Manager 4.0 to upgrade an ESX/ESXi 3.x host to ESX/ESXi 4.0, if the host is part of an HA or DRS cluster, the upgrade succeeds and the host is reconnected to vCenter Server, but HA reconfiguration might fail. The following error message displays on the host Summary tab: HA agent has an error : cmd addnode failed for primary node: Internal AAM Error - agent could not start. : Unknown HA error .

    Workaround: Manually reconfigure HA by right-clicking the host and selecting Reconfigure for VMware HA."

    The problem is that this workaround doesn't work for me, so I was wondering if someone can once again help me with this issue.

    Thanks in advance for your support.

  • RELEVANCY SCORE 3.04

    DB:3.04:Ha Isolation Address Error 8f


    I am receiving the following error when I enable HA in our cluster:

    vSphere HA agent on this host could not reach isolation address 180.16.0.1,,

    Is there an advanced setting that needs to be updated?

    thanks

    the environment is defined as:

    vSphere 5 (ESXi and vCenter)
    MD3200i SAN
    2 Dell R710s
    No DNS or DHCP

    ESXi 1 Console
    180.16.0.100
    180.16.0.1 (gateway)

    ESXi 2 Console
    180.16.0.110
    180.16.0.1 (gateway)

    vCenter Server
    180.16.0.200
    180.16.0.1 (gateway)

    ESXi 1 VMotion
    192.168.1.1
    180.16.0.1 (gateway)

    ESXi 2 VMotion
    192.168.1.2
    180.16.0.1 (gateway)



    DB:3.04:Ha Isolation Address Error 8f


    From what I can tell, 180.16.0.1 is not a router.

    I can set up a router on a VM if necessary.

    Is it possible that the isolation error occurs because vMotion is on a different network, 192.168.1.x?
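
    One way to reason about the addressing listed in the question is to check which interfaces actually share a subnet with the configured gateway. The sketch below uses Python's standard `ipaddress` module. The /24 prefix lengths are an assumption, since the post does not state any netmasks. It does not diagnose the HA error itself; it only shows which interfaces could reach 180.16.0.1 directly.

```python
import ipaddress


def on_link(interface_cidr, gateway):
    """Return True if `gateway` lies inside the interface's own subnet."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network


if __name__ == "__main__":
    gateway = "180.16.0.1"
    # Addresses from the post; the /24 masks are assumed, not given.
    interfaces = {
        "ESXi 1 Console": "180.16.0.100/24",
        "ESXi 2 Console": "180.16.0.110/24",
        "ESXi 1 VMotion": "192.168.1.1/24",
        "ESXi 2 VMotion": "192.168.1.2/24",
    }
    for name, cidr in interfaces.items():
        state = "same subnet as" if on_link(cidr, gateway) else "different subnet from"
        print(f"{name} ({cidr}): {state} {gateway}")
```

    Under those assumed masks the console interfaces share a subnet with 180.16.0.1 and the VMotion interfaces do not, so whether anything actually answers at 180.16.0.1 is the first thing to confirm.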

  • RELEVANCY SCORE 3.04

    DB:3.04:Ha Agent Has An Error (Timeout) - Not A Dns Issue kj



    Hello.

    I experienced a problem while adding my first two newly-installed ESXi hosts to a VC Cluster (HA+DRS).

    I added the first one with no problems; HA configured without errors.

    When I try to add the second host, the HA agent configuration hangs at 86% and times out after three minutes, returning the message "HA Agent has an error". The error detail also says that configuration failed because the command ft_startup did not complete within 3 minutes.

    I have already found other threads and posts about this issue, but on ESX, not ESXi.

    They report DNS problems, but I don't have any DNS issue: I am able to ping all hosts from the ESXi hidden console with the short hostname and with the complete FQDN (hostname.domain). Reverse DNS is also OK (FQDN only, obviously). Hostnames are all correct in /etc/hosts and are all lowercase.

    Some information:

    Domain name: itt.ferrovienord

    DNS: 10.2.10.111 (dhcp-dns.itt.ferrovienord)

    VirtualCenter: 10.2.10.100 (virtualcenter.itt.ferrovienord)

    ESXi #0: 10.2.30.200 (saturn.itt.ferrovienord)

    ESXi #1: 10.2.30.201 (venus.itt.ferrovienord)

    Thank you so much for your help!!

    Alessandro

    DB:3.04:Ha Agent Has An Error (Timeout) - Not A Dns Issue kj

    Any resolution on this?

    Having the same issue with U3 on ESXi.. HA will not configure
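
    The name checks described in the question above (lowercase hostnames, both the short name and the FQDN present in /etc/hosts) are easy to get subtly wrong by eye. A minimal sketch of an automated check follows. The rules it enforces are assumptions drawn from this thread, not an official VMware validation, and the function name is invented for illustration.

```python
def check_hosts_entries(hosts_text):
    """Return a list of problems found in /etc/hosts-style text."""
    problems = []
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        if ip.startswith("127.") or ip == "::1":
            continue  # loopback entries are not relevant here
        if not names:
            problems.append(f"line {lineno}: {ip} has no hostname")
            continue
        for name in names:
            if name != name.lower():
                problems.append(f"line {lineno}: {name} is not lowercase")
        fqdns = [n for n in names if "." in n]
        shorts = {n.lower() for n in names if "." not in n}
        if not fqdns:
            problems.append(f"line {lineno}: {ip} has no FQDN")
        for fqdn in fqdns:
            if fqdn.split(".")[0].lower() not in shorts:
                problems.append(f"line {lineno}: {fqdn} has no matching short name")
    return problems


if __name__ == "__main__":
    sample = (
        "10.2.30.200 saturn.itt.ferrovienord saturn\n"
        "10.2.30.201 venus.itt.ferrovienord venus\n"
    )
    print(check_hosts_entries(sample) or "no problems found")
```

    A clean result only rules out the hosts-file formatting issues; as this thread shows, the ft_startup timeout can still occur when name resolution looks correct.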

  • RELEVANCY SCORE 3.03

    DB:3.03:Hardware Host Frozen 8m



    Hi Guys,

    I have a VM cluster with 8 physical ESXi 5.5 hosts. Today one of my hosts (host1) froze. The other servers showed the following message: The vSphere HA agent on this host can not reach some of the management network addresses of other hosts, and HA may not be able to restart VMs if a host failure occurs: 1 host. There was nothing else to do, so I rebooted the blade. When that host responded again in vSphere, to my surprise another host stopped responding. Each time I rebooted a blade, another one stopped. vCOps reported a lack of resources for failover. I set DRS to fully automated, and the environment is currently stable, with no more hosts failing.

    Thanks for helping!


  • RELEVANCY SCORE 3.03

    DB:3.03:Module Ha Has Been Disabled On All Hosts 88



    Hi all,

    Can someone help me understand what happened and how to fix the problem? All VMs crashed and I had to reboot all hosts.

    Before that, all hosts and the VC were pingable by FQDN and short name.

    The hosts file contains the FQDN of every host.

    The hostnames contain some capital letters; could that be a problem?

    Many thanks

    Daniele

    Here is what happened on the VI with 3 hosts:

    at 4.56 ESX03:

    -HA has been disabled

    -HA agent disabled

    -HA agent on server ESX03 has an error

    -Error detected on ESX03, :createuser failed cmd createuser failed internal error

    -Unable to contact a primary HA agent

    -Insufficient resources to satisfy HA failover on cluster

    at 4.56 ESX02:

    -HA is being disabled on ESX02

    -HA agent disabled

    -Error detected on ESX02 no primaryagent: Could not find a primary host to configure DAS on ESX02

    -HA agent on ESX02 has an error

    at 4.56 ESX01 (primary agent)

    -HA is being disabled on ESX01

    -HA agent disabled

    -Enabling HA agent on host ESX01, cluster configured correctly

    -Sufficient resources are available to satisfy HA failover in cluster

    DB:3.03:Module Ha Has Been Disabled On All Hosts 88


    When you do an mgmt-vmware restart, the vpxa agent (vmware-vpxa) would get restarted as well.

    When you restart VC, it has to reconnect to vpxa with the credentials to get it going again, and sync with it. During this period, vpxa does not know the state of HA on the host, and starts with the assumption that it's disabled. During the sync process, it gets the right information and also starts monitoring HA.

    So during this brief period, you will see that HA is disabled, but once the sync is done, you'll see that HA has been running ok on this host.

  • RELEVANCY SCORE 3.03

    DB:3.03:Ha Not Working In Mixed Esx/Esxi Cluster pa



    Hi,

    I just added a new host to a cluster. The host is a HP DL380 G7 running ESXi 4.0. The other 3 hosts in the cluster are running ESX 4.0 on hardware HP DL 380 G6.

    When I try to configure HA I get the following message:

    Cannot complete the configuration of the HA agent on the host.

    Is there anything I need to do to in order to configure an ESXi host in a cluster that is running only ESX hosts?

    Thanks in advance guys.

    DB:3.03:Ha Not Working In Mixed Esx/Esxi Cluster pa


    Thanks for the quick replies guys. That sorted out my problem. Much appreciated.

  • RELEVANCY SCORE 3.02

    DB:3.02:How To Remove A Node From Ha Cluster? 8s



    Hello,

    I have hopefully what is a simple question. I have an 8-node ESXi HA configured cluster in vCenter and I would like to make one of the 2 following changes if I can.

    1. Remove only 1 of the ESXi servers from the cluster. Can this be done?

    ----IF NOT----

    2. Can I prevent 1 particular ESXi host in the cluster from performing any HA related processes, such as migrating VM's to or from itself?

    I ask these questions because I want to have someone use one particular ESXi host but I do not want VM's migrating to the server or the ones that I place on the server from the beginning migrating off.

    So can either of my above questions/statements be accomplished?

    DB:3.02:How To Remove A Node From Ha Cluster? 8s


    1. Remove only 1 of the ESXi servers from the cluster. Can this be done?

    Yes. My recommendation is to place this host in Maintenance Mode and move it out of the cluster. Take a look at the documentation: Remove a Host from a Cluster

    2. Can I prevent 1 particular ESXi host in the cluster from performing any HA related processes, such as migrating VM's to or from itself?

    I'm not 100% sure I understand you, but yes, you can configure Admission Control and select the policy "Specify a Failover Host". Take a look:

    Specify a failover host

    With the Specify a Failover Host admission control policy, when a host fails, HA will attempt to restart all virtual machines on the designated failover host. The designated failover host is essentially a hot standby. In other words, DRS will not migrate VMs to this host when resources are scarce or the cluster is imbalanced. Please note that selecting this admission control policy is by no means a guarantee that, when a failure occurs, all VMs that need to be restarted are actually restarted on this host! If for whatever reason a restart fails or not enough resources are available, the VM will be restarted on a different host!

  • RELEVANCY SCORE 3.02

    DB:3.02:Esxi 3.5 In Production Cluster 1f



    hi

    We are thinking of migrating our ESX 3.5 clusters to ESXi 3.5, basically due to faster updates, no service console, and other known reasons.

    But when I started to migrate, I noticed that to enable HA you have to perform some tricks.

    So I'm asking: is ESX 3i ready for production? I mean, a product made to be in a cluster should have HA, how can I say, "native", and not enabled by tricks or manual configuration.

    the error i've received was:

    HA agent has an error : Host in HA Cluster must have userworld swap enabled

    and to solve it i have to enable ScratchConfig.ConfiguredScratchLocation and ScratchConfig.ConfiguredSwapState in the advanced configuration.

    is this normal ?

    Shouldn't HA be "native" rather than something enabled through advanced configuration?

    what do you think about it?

    thank you

    DB:3.02:Esxi 3.5 In Production Cluster 1f


    Have you tried just..

    /vmfs/volumes/localdatastorename/

  • RELEVANCY SCORE 3.02

    DB:3.02:Error On Reconfigure Ha sm



    Hello,

    We have an issue with an ESXi 3.5 cluster with 6 nodes.

    One node had an issue with the HA agent, and we get an error on the Reconfigure for HA task:

    HA agent has an error : cmd addnode failed for primary node: /opt/vmware/aam/bin/ft_startup failed to complete within 3 minutes : Unknown HA error

    DB:3.02:Error On Reconfigure Ha sm


    Remove all the hosts from the cluster and add them one by one.


  • RELEVANCY SCORE 3.02

    DB:3.02:Misconfiguration In The Host Network Setup 1c



    Hello All,

    Basically, I have two ESXi hosts, 192.168.1.5 and 192.168.1.4. I enabled DRS and HA. vMotion was working fine before I moved the hosts to the cluster, but once I added them to the cluster I got the following error. The gateway and NTP settings are the same, and I have configured both standard and distributed switches, but I am still facing the same error.

    Error: HA agent cannot install on esxi host. The error message is:

    Reconfigure HA host

    192.168.1.5

    Cannot complete the configuration of the HA agent on the host. Misconfiguration in the host network setup.

    Could you please advise?

    Thanks.

    DB:3.02:Misconfiguration In The Host Network Setup 1c


    Be sure that each host has a hostname and a DNS A record.

    Check that networking is configured in the same way on each host.

    See also this KB:

    http://kb.vmware.com/kb/1001596

    Andre
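
    The DNS check suggested above can be sketched in Python. This is a minimal sketch, not a VMware tool: the function name `check_dns` and its injectable `forward`/`reverse` resolver parameters are assumptions made for illustration. It only verifies that a host name has a forward record and that the resulting address resolves back to the same short name.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems found for one host name (empty list = consistent).

    forward/reverse default to the system resolver; they are parameters only so
    the check can be exercised without a live DNS server (an illustrative choice).
    """
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        return [f"{hostname}: no forward (A) record: {exc}"]
    try:
        reverse_name = reverse(ip)[0]
    except OSError as exc:
        return [f"{hostname}: {ip} has no reverse (PTR) record: {exc}"]
    problems = []
    # Compare the first label case-insensitively, so a short name matches its FQDN.
    if reverse_name.lower().split(".")[0] != hostname.lower().split(".")[0]:
        problems.append(f"{hostname}: {ip} resolves back to {reverse_name}")
    return problems
```

    Calling `check_dns("esx1.example.local")` (a hypothetical host name) for every host in the cluster, from the machine running vCenter, would give a quick list of names that fail forward or reverse lookup.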

  • RELEVANCY SCORE 3.01

    DB:3.01:Ha Agent Diabled Or Error On Esxi U3 m8



    I have a lab of several ESXi Update 3 servers running with VC 2.5 U3, and when I create a cluster for HA and DRS, on a couple of the servers I get either a

    "HA agent disabled on <host> in cluster <clust> in <vdc>"

    or

    "HA agent on <host> in cluster <clust> has an error"

    I have checked DNS and all hosts are defined and reachable. Reverse DNS is also defined correctly.

    All hosts can see the SAN LUNs and have all of the same network labels defined on them.

    Can anyone provide me with some ideas on how to go about taking care of this problem?

    Thank you,

    DB:3.01:Ha Agent Diabled Or Error On Esxi U3 m8


    After upgrading to Build 130755, everything stabilized.

  • RELEVANCY SCORE 3.01

    DB:3.01:Ha Agent Error ap



    I have two hosts in a cluster. At one point I had HA and DRS enabled for the cluster. I now only have DRS enabled. However, one of the hosts in the cluster is still getting a "HA agent on......in cluster.....has an error".

    Trying to get rid of error message.

    DB:3.01:Ha Agent Error ap


    Same problem here, but my resolution was a bit different. Using ESX 3.5, DNS was set up correctly, but the time wasn't. I synced all ESX servers and the VC server with a local NTP server, and the HA errors disappeared.

  • RELEVANCY SCORE 3.01

    DB:3.01:In Ha After Host Failure Virtual Machine Grayed Out. ps



    Hi,

    I am evaluating HA.

    I have created two ESXi hosts inside a cluster, with both HA and DRS enabled on the cluster. I have put DRS in fully automated mode. When an ESXi host fails, the virtual machine gets powered off and then powered on again.

    But the issue is that this virtual machine, which was on the failed ESXi host, is grayed out and I can't do anything with it.

    Any help will be greatly appreciated

  • RELEVANCY SCORE 3.01

    DB:3.01:"An Error Occured During Configuration Of The Ha Agent On The Host" - On Some Hosts kj



    I'm attempting to configure VMware HA on a cluster of 3 ESXi hosts. The hosts are configured with VMware ESXi Server 3i 3.5.0 1133388.

    When attempting to configure HA I get the error: "An error occured during configuration of the HA Agent on the host" but in the tasks and events pane for each host there is no further information. The thing that's really confusing me is that the hosts are configured identically and yet this occurs on 2 of the hosts but one of them reports no errors.

    The ESXi hosts were added to a cluster through IP address rather than name (there is no DNS server in the isolated management network)

    Each host uses itself as its preferred DNS server

    The VC is used as the VMkernel default gateway

    Has anyone encountered similar problems with this build?

    DB:3.01:"An Error Occured During Configuration Of The Ha Agent On The Host" - On Some Hosts kj


    Thanks, that's getting me in the right direction. I seem to have a new problem, one of the hosts reports the following error:

    "Configuration of host IP address is inconsistent on host :address resolved to XXX.XXX.103.1"

    XXX.XXX.103.1 is the IP address used for a VMkernel port group reserved for VMotion and IP Storage

    XXX.XXX.102.160 is the IP address of the management network

    Both addresses use a subnet mask of 255.255.252.0

    I've checked the hosts file and it is using XXX.XXX.102.160

    HA was set to use the das.allowNetwork0="Management Network" option to prevent it using the storage/vmotion network

    Note - I've removed some of the information as this is going to be installed in a live environment and don't want to expose too many details

    Any idea what is going on?
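
    One detail in the post above can be checked with plain arithmetic: with a 255.255.252.0 mask, third octets 102 and 103 both AND to 100, so the management address and the VMotion/IP storage address sit in the same subnet. Whether that overlap is what triggers the reported error is not confirmed anywhere in the thread. A minimal Python sketch of the check, using 10.0 as a hypothetical stand-in for the redacted XXX.XXX octets:

```python
import ipaddress

MASK = "255.255.252.0"
# 10.0 is a placeholder for the redacted "XXX.XXX" octets in the post.
mgmt = ipaddress.ip_interface(f"10.0.102.160/{MASK}")
vmotion = ipaddress.ip_interface(f"10.0.103.1/{MASK}")

# Two interfaces share a subnet when their network addresses and masks match.
same_subnet = mgmt.network == vmotion.network
print(mgmt.network, vmotion.network, same_subnet)
```

    Both interfaces map to the x.x.100.0/22 network, which may be worth ruling out before looking elsewhere.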

  • RELEVANCY SCORE 3.01

    DB:3.01:Cannot Complete The Ha Configuration On Host Upgrade To 4.1 And Patching d1



    Hi All -

    I'm baaack.. OK - my upgrade from 4.0 to 4.1 was going outstanding until I got to my 5th host (out of 20) on my dev cluster. VUM reported a successful upgrade, and I then proceeded to patch with the latest and greatest from the repository. I installed Open Manage for ESXi (as I had in the previous host upgrades), rebooted, took the host out of maintenance mode (which took 20 minutes), and all the hosts in my dev cluster went red (exclamation mark). The last host I was working on had an error that contained "cmd addnode failed for primary node: Internal AAM error"; the others have "Error while running health check script".

    I am focusing on the host that I updated last since I have no guests running on it. Here is what I have tried so far..

    1) Removed from vCenter, re-added to original cluster (cluster inside the root cluster), then took out of maintenance mode = no luck

    2) Reconfigured HA = no luck

    3) Removed from vcenter, added to the root cluster, then took out of maintenance mode = SUCCESS

    Now when I try to add that host back into its original cluster, the same issue as above happens on configuring/enabling HA (Admitting host into cluster step). "Cannot complete the configuration of the HA agent on the host. Unable to contact a primary HA agent." So this tells me something must be wrong with my cluster all of a sudden or is it coincidence?

    Anyone see anything like this before? If so, how did you take care of it?

    Thanks.

    DB:3.01:Cannot Complete The Ha Configuration On Host Upgrade To 4.1 And Patching d1


    wow.. OK - All I need to do is post on here and then voilà! I find the answer..

    http://communities.vmware.com/message/1578491#1578491

    Once I disabled VMware HA on the cluster and re-enabled it, everything came back on line... Goofiness. Hope this helps someone else that may come up against this.

  • RELEVANCY SCORE 3.00

    DB:3.00:After Moved A Existed Host To Ha Cluster, Got Misconfiguration In The Host Network Setup cp



    An existing host, which has only one NIC, worked well in the datacenter. But after I moved it to an HA cluster, I got:

    Configuring HA

    10.1.1.203

    Cannot complete the configuration of the HA agent on the host. See the task details for additional information. Misconfiguration in the host network setup.

    Does HA require more than one NIC? The procedure for creating an HA cluster does not mention such a requirement.

    Thanks!

    George

  • RELEVANCY SCORE 2.99

    DB:2.99:To Add A Ad Account To Read-Only Role On Esxi Server az



    I have an AD user account "nam\test" that I need to add to the "read-only" role on each ESXi host in the vCenter.

    DB:2.99:To Add A Ad Account To Read-Only Role On Esxi Server az


    Then you are probably encountering the bug reported in the thread "New-VIPermission bug still exists in PowerCLI 5.0?"

    Can you try the workaround based on the SetEntityPermissions method from that post ?

  • RELEVANCY SCORE 2.98

    DB:2.98:Equal Hosts In A Ha Cluster? jk


    I am using ESX 3.5 Update 3 on two hosts in a cluster, and I am having problems trying to activate HA.

    When I activate HA (only), one of the hosts indicates that the HA agent was configured without any problems. On the other host I get a message that the HA agent was configured OK but was disabled.

    Between the hosts, all the networking is identical, all the host names are in lower case. Both hosts are connected to the same LUNs.

    For HA is there a requirement that all host have identical memory, CPUs, and CPU speed?

    I have re-created the cluster and moved both hosts into it, and I am at a loss as to why HA is being disabled.

    any thoughts? Thanks for any and all help.

    DB:2.98:Equal Hosts In A Ha Cluster? jk

    Kyle...

    Thanks for the reply. I did not think I had to have the same hardware for HA but thought I would just make real sure.

    Host failures is set to 1. "Prevent VMs from being powered on..." is selected.

    Russ

  • RELEVANCY SCORE 2.98

    DB:2.98:Ha Error: Vsphere Ha Agent On This Host Cannot Reach Some Of The Management Network Addresses Of Other Hosts a8



    We recently enabled HA on a 3-node cluster and are receiving this warning on the host summary tab: "vSphere HA agent on this host cannot reach some of the management network addresses of other hosts." We have cross-verified the FQDNs of the ESXi hosts, run vmkping between the hosts, run services.sh restart, and followed KBs and blogs, with no solution yet. We are still not able to fix the warning. Can anyone help me out with detailed troubleshooting steps?

    DB:2.98:Ha Error: Vsphere Ha Agent On This Host Cannot Reach Some Of The Management Network Addresses Of Other Hosts a8


    I assume that means it's still not working!?

    The next step is to check the network configuration of the hosts to ensure there's nothing wrong there (e.g. a typo in a domain name, DNS server address or subnet mask, ...) Please double check the settings for the "Management Network" port group as well as the "DNS and Routing" settings in the "Configuration" tab for all of the hosts.

    Andre

  • RELEVANCY SCORE 2.98

    DB:2.98:Ups Shutdown In Vmware Ha Cluster jk


    Hi all,

    I have been tasked with studying how a UPS signal can shut down the guest OSes and then the ESXi hosts in an HA cluster. Is there any reference for this? Thanks

    Regards

    DB:2.98:Ups Shutdown In Vmware Ha Cluster jk


    Hello,

    check this:

    How to configure ESXi to shutdown using an APC SmartUPS (with lamw scripts)

    Maybe it will be helpful

    Best regards,

    Pablo

  • RELEVANCY SCORE 2.97

    DB:2.97:Enable Ha? mm



    Hi,

    When I enable HA I get a host of messages.

    "Cannot complete the configuration of the HA agent on the host. See task details for additional information. Other HA configuration error."

    Related Events:

    Unable to contact a primary HA agent cluster X in X

    HA agent on 192.168.X.X in cluster X in X has an error:

    cmd addnode failed for secondary node: Internal AAM Error - agent could not start.: Unknown HA error

    HA agent on 192.168.X.X in cluster X in X has an error: Cannot complete the HA configuration.

    DB:2.97:Enable Ha? mm


    For getting FT to work for VMs, I highly recommend downloading and running the Site Survey utility located here:

    http://www.vmware.com/download/shared_utilities.html

    Install it on your VC box, select the cluster you want to scan, and you will then see a tab named "Site Survey". Once you run it, it will let you know if you meet all the requirements on that cluster for VM FT and will scan each VM and show which ones meet the requirements. Great utility.

  • RELEVANCY SCORE 2.97

    DB:2.97:Ha Issue k8


    The cluster consists of 4 hosts. Two of the hosts get "An error occurred during configuration of the HA agent on the host". Both hosts can ping the other hosts in the cluster by FQDN and short name.

    DB:2.97:Ha Issue k8

    It will work with or without the redundant SC management network, provided that the single path does not get disconnected or fail somehow.

    It is better to have a team across two physical paths, but this is not always an option for some installs.

    Just to clarify that you are not seeing some other issue, what I am discussing is specifically as follows.

    "Host <name> currently has no management network redundancy"

    Which is a warning state more than an error.

  • RELEVANCY SCORE 2.97

    DB:2.97:Cannot Install The Vcenter Agent Service. Unknown Installer Error ac


    Hi,

    I have installed ESXi 4 on two PCs and vCenter on another PC. Earlier I added the two hosts (by IP address) to vCenter, and there was no problem adding them.

    Then, to get HA working, I needed to remove the cluster (the hosts had been added to that cluster), and when I went to add the hosts again I got the error "Cannot install the vCenter agent service, Unknown installer error". The detailed steps are below:

    I removed the cluster from vCenter - no problems found

    I added a cluster with HA and DRS enabled - no problems found

    I added a host vm1.gtlbd.com (IP 114.130.4.201) - it was added, with a warning on the cluster: "insufficient resources to satisfy HA failover level" - it seems another host, vm2.gtlbd.com, needed to be added.

    I then added another host, vm2.gtlbd.com (114.130.4.202), to that cluster, and it gives the error "cannot install vcenter agent on vm2.gtlbd.com, unknown error" - a screenshot is attached. I urgently need feedback, please.

    Ref. of my changes: http://communities.vmware.com/message/1445360#1445360

    Can anybody help me find a resolution?

    regards..

    Apurba

    DB:2.97:Cannot Install The Vcenter Agent Service. Unknown Installer Error ac


    Check the hostd service on the ESXi host and restart it. Also remove the ESXi host from VC and connect it back.

  • RELEVANCY SCORE 2.97

    DB:2.97:Ha Agent Showing Failed On One Esx Host In A Cluster And One Vm Is Not Responding. Unable To Migrate. md



    Hi All,

    We are receiving an HA error on one of our ESXi hosts in a cluster of 4 servers. It shows an error that HA is not working: connection timed out.

    All VMs hosted on it show a disconnected status but are working fine; however, there is one VM which is not responding. How can I get it working, given that migration is also not working for it?

    DB:2.97:Ha Agent Showing Failed On One Esx Host In A Cluster And One Vm Is Not Responding. Unable To Migrate. md


    Check the syslog and hostd logs on your host.

    You can browse the logs via https://<ESXi IP address>/host, and you should also be able to get them via the VI Client or the DCUI console.

    The 2 logs above should help you on your way but here are what some of the logs do:

    http://sparrowangelstechnology.blogspot.com/2012/07/what-to-dcui-console-logs-show.html

  • RELEVANCY SCORE 2.96

    DB:2.96:Esxi Host Failing To Be A Memeber In A Ha Cluster pc


    Hi, when I try to move an ESXi 3.5 server into an HA cluster on a VC server, it fails. The following error message can be seen in the Tasks & Events tab.

    Command 'hostname -s' on host (ip address on host) failed or returned incorrect name format.

    How can I correct this?

    DB:2.96:Esxi Host Failing To Be A Memeber In A Ha Cluster pc

    Hi, I am just testing out VMware in a test lab. I have not renamed the ESXi host.

    I registered the host with its IP address in VC.

    I have 2 ESXi hosts. One registers fine in HA, but the other one only gets the error message posted originally.

    I have removed the host from the VC and then added it again, but the same error occurs.

    Is there a simple way to get this working? I would really like to start testing HA functionality.

    As I mentioned earlier, this is for the moment just a test lab, so I am ready to try every good suggestion :-)

  • RELEVANCY SCORE 2.96

    DB:2.96:Ha Error During Installation zx


    Hi, first I want to say sorry for my bad English, but I will do my best.

    I have a BladeCenter S with 4 blade modules, 2 SAS RAID switches, 2 Cisco switches and 2 external HDD storage units.

    I installed ESXi 3.5 build 207095 (or so) on my blade modules.

    All modules can ping each other (by IP address, FQDN, and short name).

    On one of my ESXi 3.5 hosts I have installed a virtual Windows Server 2003 with DNS.

    That server is my DNS server and the client for my vCenter Server.

    Then I wanted to configure the datacenter, cluster and HA with my Infrastructure Client.

    At first, when I wanted to add HA, I got the "HA agent has an error : Host in HA Cluster must have userworld swap enabled" error.

    I solved it with this site .

    Then I got the error "HA agent failed to remove ......bla bla something". I solved that error too.

    Now I get an error that says "cmd addnode failed for secondary node: /opt/vmware/aam/bin/ft_startup failed to complete within three minutes".

    I installed DNS and wrote the hosts file on my blade modules (ESXi 3.5) with the hidden CLI (http://www.virtualizationteam.com/virtualization-vmware/virtualization-vmware/virtualization-vmware/vmware-esxi/vmware-esxi-console-access-unsupported.html).

    Then I tried this, and nothing worked for my last problem....

    I don't know what to do.

    Pls help me.

    DB:2.96:Ha Error During Installation zx

    Hi, thanks for your answer, but I have ESXi 3.5 build 207095 on my blades... do you think that build "158874" would help?

  • RELEVANCY SCORE 2.96

    DB:2.96:Cannot Complete The Configuration Of The Ha Agent On The Host s3



    One of the hosts in my cluster is failing on HA. The error is:

    Cannot complete the configuration of the HA agent on the host. Misconfiguration in the host network setup.

    But nothing was changed on the network, and the DNS settings are fine.

    I tried running services.sh restart, and vmware-aam will not start.

    any idea?

    DB:2.96:Cannot Complete The Configuration Of The Ha Agent On The Host s3


    Weird. Both VMkernel NICs were checked for management. I unchecked one of them and reconfigured HA. It's OK now.

  • RELEVANCY SCORE 2.96

    DB:2.96:Vsphere Ha Unsuccessfully Failed Over Vm 3z



    Hi,

    I have a two-host ESXi 5 cluster with HA enabled. VM monitoring is not enabled. One VM had a problem and I got the following events -

    vSphere HA unsuccessfully failed over VM Name on Host in cluster. vSphere HA will retry if the maximum number of attempts has not been exceeded.

    The host did not fail and no other VMs are affected.

    Thanks,

    DB:2.96:Vsphere Ha Unsuccessfully Failed Over Vm 3z


    Hi,

    Thanks very much. I will open a case with VMware.

    Cheers,

  • RELEVANCY SCORE 2.96

    DB:2.96:Esxi 5.0 U3 Ha Problem m7



    Hi all,

    I'm hoping for some advice here.

    To cut a long story short, I had to uninstall vCenter 5.0 and install 5.0 Update 3. This all went through fine but 1 of our 9 hosts is having trouble initialising HA. It will stall at installing the HA agent.

    I have rebooted the host, removed from the cluster and added back in and tried a number of solutions on the net such as, stopping HA for the cluster and starting the service on the host before the whole cluster etc.

    I really think I've tried everything suggested on the net and that my only solution is to re-install ESXi on the host.

    One thing that might help is that I also cannot scan the host for updates through the update manager. All others are fine. I've checked the firewall settings but cannot see any difference to the working hosts.

    Any help will be greatly appreciated.

    DB:2.96:Esxi 5.0 U3 Ha Problem m7


    Yeah, removed it completely from vCenter and added back in. Same issue. I feel that I've tried everything suggested on the net and that a reinstall of ESXi is the only option really.

  • RELEVANCY SCORE 2.96

    DB:2.96:Ha/Drs Cluster aa


    3 Identical ESXi Hosts in a DRS/HA Cluster

    1 host will not enable the HA agent successfully.

    No problem if DRS only.

    ESXi 4 Update 2

    any ideas?

    DB:2.96:Ha/Drs Cluster aa


    Maybe this KB will help

    http://kb.vmware.com/kb/1003735


  • RELEVANCY SCORE 2.95

    DB:2.95:Ha Agent Error m9


    Hi !

    2x ESX 3.5 build 98103

    VC 2.5.0 build 64192

    Shared (LUN) SAN FC datastore

    I am trying to create an HA cluster.

    When I add the second ESX host, I get the error "HA agent on 'esx host ip' in cluster hac in New Datacenter has an error"

    This error appears exactly when the second ESX host is added.

    DB:2.95:Ha Agent Error m9

    Tom,

    Apparently the issue was resolved by increasing the console memory.

    Thank you for the valuable response. But I have more questions coming up. I want to know whether how I added the host to the VC is going to have an impact.

  • RELEVANCY SCORE 2.95

    DB:2.95:Vsphere Ha Unsuccessfully Failed Over. Operation Is Not Allowed In Current State! ? df



    Hello,

    I am testing vSphere HA with two ESXi hosts in my lab. I have one VM on each ESXi host and when I try to initiate HA by shutting down the switch port connection to one of the hosts, the vSphere cluster tries to failover the VM on the failed host to the other host, but is not successful.

    The events under the Cluster gives this warning:

    vSphere HA unsuccessfully failed over virtual-machine on host in cluster cluster name. vSphere will retry if the max number of attempts has not been exceeded. Reason: The operation is not allowed in the current state.

    Attached is the screenshot of the error.

    I tried restarting the vCenter server service and that did not help. What could be wrong here?

    Thank you.

    Shivani

    DB:2.95:Vsphere Ha Unsuccessfully Failed Over. Operation Is Not Allowed In Current State! ? df

    Hi Shivani,

    The behavior depends on the datastore type that VM resides - is it a FC or network backed datastore? FC/NFS/iSCSI ? One possible reason it didn't work could be as follows: (I am speculating here given the above details, so bear with me if I misstate something)
    - HA master tries to failover the VM (since in this case the slave is considered dead as there is no HB datastores during n/w isolation)
    - If the VM resides in a FC based datastore, then HA master cannot failover (i.e. register and powerOn the VM on master's host in this case) the VM since the lock was still held by the other host (Note, VM is still powered on)

    Thanks for bringing this up - we will also clarify this in the KB article.

  • RELEVANCY SCORE 2.95

    DB:2.95:Error: Adding New Vmhost To New Cluster With Ha And Drs z3



    I just finished installing ESX 3.0.1 and added it to my Virtual Management Center. I then created a Cluster, and added this host to it.

    I received the following errors:

    Insufficient resources to satisfy HA failover level on cluster 'Cluster 1' in VM's

    Unable to contact a primary HA agent in cluster 'Cluster 1' in VM's

    There is only one other host being managed by this Virtual Center Management server, and it is currently stand-alone, not in a cluster.

    What is going on?

    DB:2.95:Error: Adding New Vmhost To New Cluster With Ha And Drs z3


    If the answer was helpful, at least offer the points... not much, its all we get...

  • RELEVANCY SCORE 2.95

    DB:2.95:Ha Agent Has An Error Incompatible Ha Networks 9c


    HA agent has an error Incompatible HA Networks: host has network that don't exist on cluster members: 10.10.2.220: Cluster has network missing on host: 10.10.1.220

    DB:2.95:Ha Agent Has An Error Incompatible Ha Networks 9c

    HA agent has an error Incompatible HA Networks: host has network that don't exist on cluster members: 10.10.2.220: Cluster has network missing on host: 10.10.1.220

  • RELEVANCY SCORE 2.95

    DB:2.95:Problem With Adding Second Node To Ha 97


    Hi All,

    I have the ESX 3.5 evaluation version. I have already set up HA on the first node. When I try to add a second node, VI shows a red question mark on the summary tab for this machine, along with these notes:

    - An error occurred during configuration of the HA Agent on host

    - HA agent on 192..... in cluster in ... has an error.

    and there is an error on group HA:

    - Insufficient resources to satisfy HA failover on cluster

    I understand the 'Insufficient resources to satisfy HA failover on cluster' but what could be a reason of two first errors?

    Regards,

    PAwel

    DB:2.95:Problem With Adding Second Node To Ha 97

    Hi

    Many thanks to John (retonj) and weinstein5,

    that was the issue. After updating /etc/hosts and A record on DNS, there is no problem with HA nodes.

    Thank You

    Pawel

  • RELEVANCY SCORE 2.94

    DB:2.94:Need To Stop Ha During San Maintenance! 1s



    Hi all,

    We have vCenter 4.1 Standard edition. I configured HA on a four-node cluster. I have planned a maintenance activity on the SAN storage array. If I shut down the ESXi hosts one by one, HA migrates the VMs to the next available host. But this is planned maintenance, so I need HA not to act during the maintenance activity. What do I need to do? Can I stop the vCenter Server service, or do I need to disable host monitoring? Please advise. Please also see my attachments and tell me whether my HA configuration is correct.

    Note:

    All ESXi Hosts have same configuration in four node cluster.

    DB:2.94:Need To Stop Ha During San Maintenance! 1s


    Hi

    Disable HA and DRS and then try shutting down your hosts.

  • RELEVANCY SCORE 2.94

    DB:2.94:Disabling Lro On The Host Breaks The Ha Agent? fx



    Following these directions from Cisco to disable LRO on the ESXi host appears to prevent the HA agent from enabling on that host. When these changes are undone, the HA agent successfully enables as expected. Obviously, this is not a desired state.

    Is LRO on the host required for HA?

    Are these directions faulty?

    Ref:

    http://docwiki.cisco.com/wiki/Disable_LRO

    Your help is appreciated!

  • RELEVANCY SCORE 2.94

    DB:2.94:Unable To Get Ha To Work mc



    I have three hosts, all with identical hardware and ESXi versions (4.1). The networks are configured the same on each host, and they can all ping each other by FQDN. But whenever I try to enable HA in my cluster, I always receive the error "Cannot complete the configuration on the HA agent on the host. Misconfiguration in the host network setup". All three hosts error out with this message. The hosts each have two 4-port Broadcom NICs, and the 5 active NICs are configured the same way on each host, with each NIC on a different subnet, and I have only one management interface configured for vCenter. I have checked several KBs with no luck. Any advice would be appreciated.

    DB:2.94:Unable To Get Ha To Work mc


    lol... just typing this gave me the answer.. I changed the default gateway to be itself (same ip as int) and it is working. Thanks guys!

  • RELEVANCY SCORE 2.93

    DB:2.93:Problem With Ha cj



    I have a host that can't configure HA for some reason. It was fine up until today. I tried pulling it in and out of the cluster and right clicking and reconfiguring HA. Here are the events:

    HA agent on x.x.x.x in cluster cluster1 in datacenter1 has an error: Cannot complete the HA configuration

    then the next:

    HA agent has an error: cmd addnode failed for primary node: Internal AAM Error - agent could not start. "Unknown HA error"

    Any ideas why it's having these issues?

    DB:2.93:Problem With Ha cj


    Have you tried rebooting the host? At least, I would do this before going any further. If that still does not work, try creating a new cluster, moving this host to it, and enabling HA on the new cluster. See how it goes.

  • RELEVANCY SCORE 2.93

    DB:2.93:Ha Agent On Host Could Not Reach Isolation Addrss jc



    Three-host ESXi HA environment with shared storage.

    One host keeps throwing the following error.

    "vSphere HA agent on this host could not reach isolation address 192.168.0.x"

    can you help?

    DB:2.93:Ha Agent On Host Could Not Reach Isolation Addrss jc


    Seeing as the other hosts (assuming they are in the same subnet) can ping the isolation address (which is the default gateway for the management VMkernel as Andre points out), we can assume that the default gateway is an ICMP pingable device. Your best bet would be to check the specific host and see if you can ping the default gateway here. You can see what the default gateway is by running the command: esxcli network ip route ipv4 list, which would give you an output like this:

    ~ # esxcli network ip route ipv4 list
    Network      Netmask        Gateway      Interface  Source
    -----------  -------------  -----------  ---------  ------
    default      0.0.0.0        192.168.1.1  vmk0       MANUAL
    192.168.1.0  255.255.255.0  0.0.0.0      vmk0       MANUAL
    192.168.2.0  255.255.255.0  0.0.0.0      vmk3       MANUAL
    192.168.3.0  255.255.255.0  0.0.0.0      vmk1       MANUAL
    ~ #

    The first line is your default gateway. That should be reachable for the hosts in a HA enabled cluster, as they ping this address to check if they have been isolated from the network.
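
    Picking the default gateway out of that listing can be sketched in Python. This is only an illustration: the helper name `default_gateway` is hypothetical, and it assumes the five-column layout quoted above (Network, Netmask, Gateway, Interface, Source), which may differ on other ESXi versions.

```python
def default_gateway(route_output):
    """Return the gateway of the 'default' row, or None if there is no such row."""
    for line in route_output.splitlines():
        fields = line.split()
        # Assumed column order: Network, Netmask, Gateway, Interface, Source.
        if len(fields) >= 3 and fields[0] == "default":
            return fields[2]
    return None


# Sample text copied from the listing quoted in the reply above.
SAMPLE = """\
Network      Netmask        Gateway      Interface  Source
-----------  -------------  -----------  ---------  ------
default      0.0.0.0        192.168.1.1  vmk0       MANUAL
192.168.1.0  255.255.255.0  0.0.0.0      vmk0       MANUAL
"""

print(default_gateway(SAMPLE))
```

    The address it returns is the one to ping from the affected host when checking whether the isolation address is reachable.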

  • RELEVANCY SCORE 2.93

    DB:2.93:Adding Esxi 4.1 Host To Esx Cluster ds



    Can I add a new ESXi 4.1 host to an ESX 4.0 cluster without HA failing the cluster's servers over to the new ESXi 4.1 host after it reconfigures HA for the cluster?

    If it does not, is there any reason I cannot add the server to the cluster during the day without any problems?

    Thank you for your help with this.

    DB:2.93:Adding Esxi 4.1 Host To Esx Cluster ds


    Hi ranjb,

    Yes, you could create a new cluster object under your datacenter object and create another HA Cluster.

    Bear in mind that VMs can also be migrated with VMotion between cluster objects, but not between datacenter objects.

    Rgds,

    J.

    My Company http://www.jmgvirtualconsulting.com

    My Blog http://www.josemariagonzalez.es

    My Virtualization Web TV Show http://www.virtualizacion.tv

  • RELEVANCY SCORE 2.93

    DB:2.93:Vm Startup Priority In Cluster? pk



    I know how to set the automatic/manual startup order for VMs on a single ESXi host. Suppose you have a fully virtualized VMware infrastructure: ESXi, vCenter, vMotion, HA, DRS, the whole package.

    How do you set certain VMs (like domain controllers) to power on first?

    DB:2.93:Vm Startup Priority In Cluster? pk


    Check out this blog entry by depping for some good info on how this works.

  • RELEVANCY SCORE 2.93

    DB:2.93:Esx4.1 And Esxi5.1 In The Same Ha/Drs Cluster 98



    Hello all,

    In the vSphere 5.1 release notes, it states: "vCenter Server 5.1 can manage ESXi 5.x hosts in the same cluster with ESX/ESXi 4.x hosts."

    We have already upgraded vCenter/Update Manager from 4.1 U2 to 5.1, and are now going to work on the ESX hosts. The plan is to take a host out of the cluster, re-install it from scratch with ESXi 5.1, and then put it back in the cluster. There's a dozen ESX 4.1 hosts in our current HA/DRS cluster. We'll do one at a time and in a week or so should be all at 5.1.

    I was wondering if anyone has run into any issues or has any comments on having both ESX 4.1 and ESXi 5.1 in the same cluster.

    Thanks for your help...

    DB:2.93:Esx4.1 And Esxi5.1 In The Same Ha/Drs Cluster 98


    Just to add: make sure you DON'T upgrade VMFS or the VM hardware version to the latest version until you have upgraded all of the hosts to ESXi 5.1.

  • RELEVANCY SCORE 2.92

    DB:2.92:Ha Issue xk



    hi

    I have one cluster with three hosts, and another, empty cluster.

    I am adding a new host; no shared storage is configured on it yet.

    My issue is that when I add the new host to the cluster with the existing hosts, HA cannot be configured.

    Error:

    Cannot complete the configuration of the HA agent on the host. See the task details for additional information. Misconfiguration in the host network setup.

    But when I add the new host to the empty cluster, I get no errors and the host is configured properly.

    What is the problem?

    thanks a lot

    DB:2.92:Ha Issue xk


    Ah, ok so it wasn't a subnet issue, just old DNS. Yeah, that's fine

  • RELEVANCY SCORE 2.92

    DB:2.92:Adding Esxi Hosts To Vsphere Client jp



    We have 7 ESX Servers and 8 ESXi Servers.

    Six of the ESX Servers are sitting in one cluster and the last one is in its own cluster.

    It's nice to manage all 7 ESX Servers within one vSphere Client.

    The 8 ESXi Servers I have been managing each with their own vSphere Client connection.

    Is it possible to create an ESXi cluster within vSphere and add each of the ESXi hosts to that "cluster?" Or even one per "cluster?"

    I've tried and I get the following error.

    "License not available to perform the operation. The ESXi 4 Single Server license for Host 192.168.1.10 does not include vCenter agent for ESX Server. Upgrade the license."

    We don't have the budget to buy more licenses. I'm not looking to do HA or DRS on these ESXi hosts. I would just like to be able to view the servers running on them for inventory purposes without having to have 9 vSphere Clients open.

    We have a fully licensed vCenter Server, so I would think we should be able to add these ESXi Servers.

    Any help would be appreciated.

    DB:2.92:Adding Esxi Hosts To Vsphere Client jp


    Not really. ESXi is a slimmed-down version without the RHEL-based service console. Because of that, VMware can give it away for free.

    But in fact all of the features, such as HA and vMotion, are controlled by the license (serial number). You can simply enter a vSphere Enterprise Plus serial number on ESXi and you will instantly get ALL of the features.

    ---

    MCSA, MCTS Hyper-V, VCP 3/4, VMware vExpert

    http://blog.vadmin.ru

  • RELEVANCY SCORE 2.92

    DB:2.92:Problem When Adding A Esxi 4.1 Server To An Existing Ha Cluster 19



    Dear all,

    I have an existing HA cluster working with two ESX 4.0 servers. Both ESX servers are configured to use the default Service Console port for management traffic, and I have defined a separate vSwitch with a VMkernel port for vMotion. Now we want to add a newly installed ESXi 4.1 server to the HA cluster. While adding it, we get the error "Cannot complete the configuration of the HA agent on the host. Other HA configuration error". The detailed event is "cmd addnode failed for primary node: Internal AAM Error - agent could not start: Unknown HA error".

    I have tried "Reconfigure for VMware HA", but with the same outcome.

    Does anybody know what causes this problem? Is it related to networking?

    Thank you!

  • RELEVANCY SCORE 2.91

    DB:2.91:Ha Configuration 8k


    Hi, I have an ESX server with the following specifications:

    Dell PowerEdge 2950 with ESX, two 2.8 GHz quad-core CPUs, 16 GB RAM, and 6 network ports.

    I read online that ESXi could be installed on an HP ML110 G5, and since I had a spare license to manage it, I quickly bought one so that I could test HA.

    This ML110 has one 2.3 GHz dual-core CPU, 5 NICs, and 7 GB RAM. The problem is that when I create the cluster, I can move the ML110 in without any problem, but when I add the 2950 I get several messages:

    Insufficient resources to satisfy HA failover level on cluster

    HA agent on 10.10.10.214 in cluster has an error

    HA agent in cluster has an error: incompatible HA networks: Host has networks that don't exist

    Consider using Advanced cluster ....

    I assume the ML110 does not have enough memory and CPU to take on the VMs that would "probably" be running on the 2950. Is that right? Do the servers in the cluster need to have the same number of processors and the same amount of memory?

    I have also seen an error saying that the hosts do not have the same network configuration. I believe they do; I am attaching screenshots of everything. Can you give me a hand? ESXi will not let me create a service console...

    How could I fix this so I can test HA? At the SAN level, all the LUNs are shared by both machines. Do I need to create a quorum disk, and where is it specified or configured?

    Regards

    DB:2.91:Ha Configuration 8k

    Hi ansator2,

    To summarize very briefly, to set up the cluster you need the following:

    Shared storage (OK, you already have it).

    The same network configuration on both ESX hosts, that is, the same vSwitches and port groups (same names). Naturally, the console IPs will be different, etc.

    Two console port groups defined on each node, on different vSwitches and with different physical uplinks. The two nodes must be able to see each other over the network between their consoles (this is needed for heartbeat traffic).

    If you want VMotion and automatic DRS in addition to HA, you also need VMkernel port groups on both nodes, on the same IP network (or reachable from each other) and with gigabit uplinks. You also need identical processors (same vendor, model, stepping), or equivalent ones, or ones you can make compatible by using the NX flag or EVC (Enhanced VMotion Compatibility). Take a look at this: http://pubs.vmware.com/vi301/admin/wwhelp/wwhimpl/common/html/wwhelp.htm?context=adminfile=BSA_Migration.18.5.html

    ... I think I haven't left anything out.

    Regards

  • RELEVANCY SCORE 2.91

    DB:2.91:What Precaution Do I Need To Take While Upgrading Mixed Environment Cluster..!! dd



    Hello Every one,

    In my VMware environment I have a mixed ESX host cluster. All hosts are in one BladeCenter and use shared storage. In this mixed environment I am trying to upgrade all ESX 4.0 hosts to ESXi 4.1. The whole cluster is configured for HA and DRS.

    Please let me know what precautions I need to take.

    I upgraded one ESX host to ESXi 4.1 by putting it into maintenance mode and then installing ESXi 4.1 on that blade through the IBM BladeCenter AMM module. Then I added that ESXi host back to the mixed cluster.

    But that host raised a red alert: HA agent on that ESX host in the cluster has an error: Error while running health check script.

    Most of my servers are running in a production environment, so I don't want to make a mistake that will cost me a lot.

    Please let me know whether there are any rules or procedures I need to follow while upgrading all the ESX 4.0 hosts to ESXi 4.1 in that cluster.

    I ask because I have found some information in the VMware documentation and the forum saying that in a mixed cluster environment:

    1) I can't reboot a virtual machine.

    2) I cannot upgrade VMware Tools.

    3) I cannot create a new virtual machine on the ESXi host.

    So if I upgrade all hosts in that cluster to ESXi 4.1, will I then be able to do all the things mentioned in the points above?

    Regards,

    Kapil

    DB:2.91:What Precaution Do I Need To Take While Upgrading Mixed Environment Cluster..!! dd


    Hi,

    Thank you for the quick response.

    I have upgraded vCenter to 4.1 U1.

    I was also able to install ESXi 4.1 on one host.

    After the upgrade, I added that host to the same cluster, where HA and DRS are already on, and kept it in maintenance mode for post-installation configuration. The installation and configuration of the new ESXi 4.1 host succeeded.

    But please let me know whether the cluster's HA and DRS will have any effect on the new ESXi host. Right now that cluster has 5 ESX hosts and 2 ESXi hosts, and all the VMs on those ESX hosts are in production.

    What do I need to take care of until all the ESX hosts have been upgraded to ESXi?

    Regards,

    Kapil

  • RELEVANCY SCORE 2.91

    DB:2.91:Help! Can Only Add First Host To Esxi Ha Cluster f8



    Hello,

    As the topic states, I am having an issue adding multiple hosts to an HA cluster in my test ESXi environment. To give you some quick background on the setup, we are starting to familiarize ourselves with Virtual Infrastructure as we will be consolidating our servers sometime in late Q1 2009. We have two whiteboxes which are running ESX 3.5i U3, another system which is running a trial version of VirtualCenter 2.5 U3, and another system running OpenFiler which is acting as a SAN for the ESXi hosts. As I already mentioned, we are just trying to familiarize ourselves with the setup and everyday tasks related to VI3.

    The problem I am having is in regards to HA. I can only successfully add one host to the HA cluster. When I attempt to add a second host to the HA cluster, I always receive an error. I've tried adding the hosts in the opposite order and have even reinstalled ESXi from scratch on one of the hosts. I've also tried the usual "Reconfigure HA" command as well as disabling HA on the cluster and then re-enabling it. The error I receive when adding the second ESXi host is as follows:

    An error occurred during the configuration of the HA Agent on the host:

    cmd addnode failed for primary node: /opt/vmware/aam/bin/ft_startup failed to complete within 3 minutes.

    I've searched the internet but am unable to find many resources that hint at solving this problem.

    I was hoping someone might be able to step up and point me in the right direction for getting a working HA cluster.

    If you require any further information, kindly let me know and I will provide it.

    Thank you!

    DB:2.91:Help! Can Only Add First Host To Esxi Ha Cluster f8

    Hi,

    Please go through the KB article that David referenced, and check DNS name resolution for your ESXi servers. If name resolution fails, HA won't work.

    I hope this is informative for you.
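    As a rough illustration of that DNS check, here is a minimal Python sketch that tries to resolve each host name and reports the ones that fail. The host names and the fake lookup table are hypothetical stand-ins so the example runs without a network; on a real management station you would call check_name_resolution with the default resolver (socket.gethostbyname) and your own short names and FQDNs.

```python
import socket


def check_name_resolution(hostnames, resolve=socket.gethostbyname):
    """Map each hostname to its resolved IP address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def unresolved(results):
    """Return the sorted list of names that did not resolve."""
    return sorted(name for name, ip in results.items() if ip is None)


# Hypothetical lookup table standing in for a real DNS server.
FAKE_DNS = {"esxi01.lab.local": "192.168.1.11"}


def fake_resolve(name):
    try:
        return FAKE_DNS[name]
    except KeyError:
        raise socket.gaierror("name not found: " + name)


results = check_name_resolution(
    ["esxi01.lab.local", "esxi02.lab.local"], resolve=fake_resolve
)
print(unresolved(results))
```

    Any name that shows up in the unresolved list is worth fixing in DNS (or in the hosts file) before reconfiguring HA.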

    Regards,

    Rajasekar

  • RELEVANCY SCORE 2.91

    DB:2.91:Will Vm Starts Up On Other Host (During Failure) If It Cannot Enter Maintenance Mode ? m1



    Hi,

    We are using a cluster with 2 ESXi 5.1 hosts. When we attempt to place one ESXi host into maintenance mode, it fails with "Insufficient Resources to satisfy configured Failover Level for HA".

    We would like to know: if one of those ESXi hosts fails, will the VMs start up on the other ESXi host (at least some of them), or will all of them stay down (i.e. none of them will start up)? If no VMs will be migrated, we will change our failover level for HA.

    Thanks

    DB:2.91:Will Vm Starts Up On Other Host (During Failure) If It Cannot Enter Maintenance Mode ? m1


    This happens when you enable admission control on your HA cluster. When a host failure occurs and admission control is enabled, it will not let all the machines power on on the other host, as that would violate the availability constraints.

    If you disable it, all the VMs will fail over even though you don't have enough resources to accommodate all of them on one host. In the long run I would suggest making sure you have enough resources to tolerate a host failure. But for now you can disable it and try.
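    To make the "insufficient resources" message more concrete, here is a deliberately simplified Python model of a slot-based failover check. It is only a sketch: the real HA calculation has more inputs (reservations, overhead, the chosen admission control policy), and the host sizes, slot size, and VM count below are made-up numbers.

```python
def slots_per_host(cpu_mhz, mem_mb, slot_cpu_mhz, slot_mem_mb):
    """Slots a host can hold: limited by whichever resource runs out first."""
    return min(cpu_mhz // slot_cpu_mhz, mem_mb // slot_mem_mb)


def can_tolerate_failures(host_slots, powered_on_vms, failures_to_tolerate=1):
    """True if the VMs still fit after the largest hosts are assumed lost."""
    keep = max(len(host_slots) - failures_to_tolerate, 0)
    survivors = sorted(host_slots)[:keep]  # drop the largest hosts first
    return sum(survivors) >= powered_on_vms


# Hypothetical hosts: 8000 MHz CPU and 16384 MB RAM each; slot size 256 MHz / 360 MB.
per_host = slots_per_host(8000, 16384, 256, 360)

# Both hosts in service, 20 powered-on VMs: one host failure can be absorbed.
print(can_tolerate_failures([per_host, per_host], powered_on_vms=20))

# One host already in maintenance mode: nothing is left to absorb a failure.
print(can_tolerate_failures([per_host], powered_on_vms=20))
```

    In this model a two-host cluster that tolerates one host failure can never spare a host for maintenance, which matches the behaviour described in the question.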

  • RELEVANCY SCORE 2.91

    DB:2.91:Ha Agent In Cluster Has An Error 19


    I have five ESX 3.5 Hosts in a cluster. The VI client is 2.5 update 4

    One of the hosts lost connection to the VI client. I ran the following commands on the console:

    service mgmt-vmware restart

    service vmware-vpxa restart

    Everything was OK.

    The host, however, still remained in the "reconnecting..." state. I disconnected the host and tried to reconnect, but then I got this message in the VI client:

    HA Agent in cluster has an error ...

    How can I restart the HA agent from the console?

    DB:2.91:Ha Agent In Cluster Has An Error 19

    Try this..

    Log in to the service console of your problem hosts and verify that VMware HA is disabled using: service vmware-aam stop

    Ensure there are no VMware HA processes running by using: ps ax | grep aam | grep -v grep

    If processes exist, kill them using the Process ID returned by the previous command (first column) as the PID: kill -9 PID

    Issue the following command via the service console, including the parentheses: (cd /etc/opt/vmware/aam; mkdir .old; mv * .old; mv .[a-z]* .old)

    Using the Virtual Infrastructure Client click on the Host, then the Summary tab, and then Reconfigure for VMware HA. It should work ...
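    The process check in the steps above (ps ax | grep aam | grep -v grep) can be sketched in Python as follows. This only parses text that has already been captured; the sample ps output is hypothetical and the PIDs are invented for illustration.

```python
def aam_pids(ps_output):
    """Return PIDs (first column) of lines mentioning 'aam', skipping grep itself."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # header or blank line
        if "aam" in line and "grep" not in line:
            pids.append(int(fields[0]))
    return pids


# Hypothetical 'ps ax' output from a service console.
SAMPLE_PS = """\
  PID TTY      STAT   TIME COMMAND
 2101 ?        S      0:00 /opt/vmware/aam/bin/ft_startup
 2150 ?        S      0:01 /usr/sbin/sshd
 2203 pts/0    S      0:00 grep aam
"""

print(aam_pids(SAMPLE_PS))
```

    An empty list means no HA processes are left running and it is safe to move the old configuration aside as described above.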

  • RELEVANCY SCORE 2.91

    DB:2.91:Ha Agent Error On A Simple Vmware Cluster s1



    We have 2 ESX 3.0.1 servers (with all of the latest security/critical patches) installed in a simple DRS and HA cluster. We keep getting the following error:

    HA agent on server2 in cluster "Production Cluster" in "DataCenter" has an error

    And then the host is displayed as being in a disconnected state inside VC.

    This error has occurred when we try to test the HA capability. With HA enabled, VMotion also doesn't seem to work. When we disable HA, but not DRS, everything is fine. Any idea?

    dwc

    DB:2.91:Ha Agent Error On A Simple Vmware Cluster s1


    Which log file are we looking at in there?

    All or some?

  • RELEVANCY SCORE 2.91

    DB:2.91:Ha Cluster 3p



    Can you have one host running vSphere ESXi 4.0 and two hosts running vSphere ESXi 4.1 in the same HA cluster? Or does every host need to be running the same version of vSphere?

  • RELEVANCY SCORE 2.91

    DB:2.91:Vsphere 4 And Esxi 4 Ha Errors mc



    I am having trouble enabling HA in the new ESXi 4 and vSphere 4 environment on my IBM BladeCenter S. Everything was working perfectly until I activated DPM in aggressive mode. DPM powered down all hosts except one, which is running all the VMs. Since then the HA agent has not been configuring properly.

    I receive the following errors when i try to enable HA on the cluster.

    HA agent has an error : cmd addnode failed for secondary node: Internal AAM Error - agent could not start. : Unknown HA error
    error
    5/27/2009 1:59:51 PM
    Administrator

    HA agent has an error : Cannot complete the HA configuration
    error
    5/27/2009 1:59:51 PM
    Administrator

    I have tried everything: disabling and re-enabling HA, creating a new cluster and datacenter, putting hosts into maintenance mode and then exiting, disconnecting the hosts and removing then re-adding them, installing fresh ESXi and vCenter servers, changing network parameters, and reconfiguring for HA.

    I have manually edited the hosts file and added entries for all ESX hosts. All ESX hosts and vCenter can ping each other by IP, FQDN, and short name.

    Any help would be highly appreciated. I am totally out of options.

    Best regards,

    Adeel Akram

    DB:2.91:Vsphere 4 And Esxi 4 Ha Errors mc

    I'd just like to add my own experience which produced similar errors....

    Two ESX 4.0.0 208167 hosts. One dropped out of the HA cluster with the error "HA agent has an error: Cannot complete the HA configuration". The host in question became unresponsive at the console (i.e. KVM) and over the network. I had a look around the communities and found the "Troubleshooting VMware High Availability (HA)" page:

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1001596

    Point 5, relating to date/time and NTP, seemed to be the fault on my system. Although the clock was not out by more than a minute, that is enough to throw the clustering out. I only had one NTP server configured (this is a lab); I would certainly recommend using the public pools or at least 3 NTP servers to ensure better timekeeping.
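    For anyone who wants to compare clocks across hosts, a minimal Python sketch of the idea follows. The host names, timestamps, and the 5-second tolerance are all assumptions for illustration; the thread does not state how much drift HA actually tolerates, and collecting the readings from each host is left out.

```python
def max_clock_skew(host_times):
    """Largest difference, in seconds, between any two hosts' clocks."""
    values = list(host_times.values())
    return max(values) - min(values) if values else 0.0


def hosts_out_of_sync(host_times, reference, tolerance_s):
    """Sorted names of hosts whose clock differs from the reference by more than tolerance_s."""
    return sorted(
        host for host, t in host_times.items() if abs(t - reference) > tolerance_s
    )


# Hypothetical readings (seconds since some common epoch), taken at the same moment.
readings = {"esx01": 1000.0, "esx02": 1000.4, "esx03": 1045.0}

print(max_clock_skew(readings))
print(hosts_out_of_sync(readings, reference=1000.0, tolerance_s=5.0))
```

    A host that shows up in the out-of-sync list is a candidate for checking its NTP configuration before reconfiguring HA.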

    - William

  • RELEVANCY SCORE 2.91

    DB:2.91:Check Host Agent Version And Reinstall 8z


    How can I check the version of a host agent ?

    It seems that we have one problematic host in our DRS / HA cluster and I seem to remember we had some problems when we upgraded the Virtual Center to Update 1.

    I am wondering now whether the upgrade of the host agent actually failed on that host. If so, where can I get the agent and how do I reinstall it without downtime ?

    DB:2.91:Check Host Agent Version And Reinstall 8z


    Got an answer from support (hope it is ok to post it here):

    Verifying that the correct version of VirtualCenter is installed.

    To verify that the correct version of VirtualCenter is installed:

    1 Right-click on the server and click Disconnect to disconnect the ESX Server in VMware VirtualCenter.

    2 Open a Secure Shell (Putty) connection to the ESX Server and log on as root.

    3 Issue the following command to determine the version of VMware VirtualCenter agent (vpxa) installed.

    # rpm -qa | grep vpxa

    VMware-vpxa-2.5.0-64192

    Ensure that the version and build number correspond to your VMware VirtualCenter installation. From VMware VirtualCenter, click Help > About.

    Reinstalling the agents

    1 Disconnect the ESX Server in VMware Virtual Center by Right Clicking the Server and select Disconnect.

    2 Open a Secure Shell (Putty) connection to the ESX Server and log on as root.

    3 Issue the following two commands in order to get the names of the packages that need to be removed.

    # rpm -qa | grep vpxa

    VMware-vpxa-2.5.0-64192

    # rpm -qa | grep aam

    Note: The aam packages may not be installed.

    4 Remove the three packages by issuing the following command.

    # rpm -e

    5 Connect the ESX Server in VMware VirtualCenter again. VMware VirtualCenter installs the required packages.
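    If you need to compare the agent on many hosts, the package name returned by rpm -qa | grep vpxa can be split into version and build with a small Python sketch like this. It assumes the name always has the form VMware-vpxa-<version>-<build> as in the output above, and it treats "correspond" as an exact match of version and build, which is an assumption; the VirtualCenter version and build passed in are whatever Help > About shows for your installation.

```python
import re

# Assumed package name layout: VMware-vpxa-<dotted version>-<build number>
VPXA_RE = re.compile(r"^VMware-vpxa-(\d+(?:\.\d+)*)-(\d+)$")


def parse_vpxa_package(name):
    """Return (version, build) from a vpxa package name, or None if it does not match."""
    match = VPXA_RE.match(name.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def matches_virtualcenter(name, vc_version, vc_build):
    """True if the host agent's version and build equal the VirtualCenter values."""
    return parse_vpxa_package(name) == (vc_version, vc_build)


print(parse_vpxa_package("VMware-vpxa-2.5.0-64192"))
print(matches_virtualcenter("VMware-vpxa-2.5.0-64192", "2.5.0", 64192))
```

    A mismatch suggests the agent upgrade did not complete on that host, which is the case the reinstall steps above are meant to fix.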

  • RELEVANCY SCORE 2.91

    DB:2.91:Host Memory Minimum For Ha? jc



    The minimum memory for ESXi should be 2 GB of RAM, but I wonder whether that is enough for running the HA agent. On two test ESXi hosts with 2 GB of RAM, setting up HA failed, and something in the error messages made me suspect a lack of RAM. After increasing the memory to 2.5 GB, HA configuration went through.

    So my first question is: is there a minimum amount of RAM on a host for HA to work?

    On the same cluster with these two small ESXi servers I noticed that the slot size was about what I expected, something like 256 MHz and 360 MB of RAM. The HA setting was to tolerate 1 host failure, with only small VMs running. What was quite strange to me was that the Advanced Runtime Info reported only one single slot in the whole cluster, and when I tested HA by powering off one host, the VM did not fail over and HA reported "not enough resources to failover VM" or similar. Why could this be happening? Does HA expect more RAM to be available?

    DB:2.91:Host Memory Minimum For Ha? jc


    Chris Wahl wrote:

    HA requires 2300 MB of memory to enable. If you have less than this amount, you will get an error when trying to enable HA.

    Great, I have not seen that number before. (And of course such small servers are unusual these days, but the minimum is still 2 GB.)

  • RELEVANCY SCORE 2.91

    DB:2.91:Ha Agent On The Host Has Been Shut Down aa


    The other day I had an ESXi host lose power. After the host came back up, it had the error "HA agent disabled on host in cluster".

    I have had this happen before after rebooting a host. What usually fixes it for me is to put the host into maintenance mode and then take it back out. I did that and it still has the error. So I ran the option "Reconfigure for VMware HA", but it gives an error: 'HA agent on <hostname> in cluster <Cluster name> has an error "HA agent on the host has been shut down"'.

    I rebooted the host again and tried the steps above but I am unable to get the HA agent to start on the host. Any suggestions?

    As an FYI, this host was a working member of my HA cluster and when the host lost power, the VMs all came up on other members of the HA cluster as they should have.

  • RELEVANCY SCORE 2.91

    DB:2.91:Vsphere Ha State Stuck In "Election" 83



    Hello,

    I am currently playing with several test environments, and I have the following case:

    - 1 vCenter 5.0 913577 that has 1 cluster with 2 ESXi 5.0 1117897 hosts, HA enabled, and 3 powered-on virtual machines

    To show someone what HA does, I powered off the slave host (from the power button). Everything went OK: the VMs were restarted on the master ESXi host.

    After I powered the slave back on, I moved 1 machine to the slave and 2 remained on the master. Everything was looking good, so I tried the same test with the master host (powered it off from the button), and now the fun begins:

    - In the Summary of the slave host I can see "vSphere HA state: Election", and on the cluster the error "Cannot find vSphere HA master agent". The VMs that were running on the master host are still down.

    Shouldn't the slave host take over the master function (as there is no other ESXi host)? Am I missing something?

    Please let me know if you need additional details.

    Thank you in advance.

    DB:2.91:Vsphere Ha State Stuck In "Election" 83


    Well, that's exactly it: "Lesson learned". Isn't that actually one of the reasons you have a lab? To break and fix things, to learn and understand how they work.

    André

  • RELEVANCY SCORE 2.90

    DB:2.90:Ha Agent On Esx-Host In Cluster Cluster-Name Has An Error k8



    I have a cluster in vCenter 2.5 with 3 ESX 3.5 hosts. The HA cluster worked fine; now 1 host is OK and 2 hosts have the red alert "HA Agent on ESX-HOST in cluster CLUSTER-NAME has an error".

    I disabled and then re-enabled HA; still the same error.

    The event shows the error "HA Agent on ESX-HOST in cluster CLUSTER-NAME has an error : /opt/vmware/aam/bin/ft_startup".

    Can someone tell me how to fix it?

    DB:2.90:Ha Agent On Esx-Host In Cluster Cluster-Name Has An Error k8

    This is not DNS problem, I have double checked the DNS resolution from both sides.

  • RELEVANCY SCORE 2.90

    DB:2.90:Disable Ha Agent On A Esx Server ak



    Hi, I tried to set up a cluster with two ESX hosts. I had some problems and removed the cluster. But on the host's Summary I can still see the message: HA agent on xxxxx in cluster xxxxx in yyyy has an error. How do I remove this message, or how do I disable the HA agent?
