Archive for the 'System Center' Category

SCVMM and P2V Adventures

Where I work, we have been using Microsoft Virtualization since Virtual Server was in Beta.  Of course, we don’t necessarily use all of the functions and features of all the software we have, but one feature that I have used a good bit is the “Convert physical server” action in System Center Virtual Machine Manager.  Until recently, I have used this with great success.  We run IBM xSeries servers and I have converted something like 50 of them to virtual machines running on Hyper-V over the past several years. 

In late 2007, we bought our first IBM Blade Center (which I am very happy with) and with that move we also decided to do “boot from SAN” for all of our blades.  Just seemed to make sense that we wouldn’t put moving parts in a device that was designed to run so well without moving parts. 

At the time, we were implementing a new ERP system and several “hanger-on” type applications, and Hyper-V (virtualization in general) wasn’t something that was supported by a lot of the software we were deploying.  So we have a lot of powerful blade servers, running a lot of low use applications.  I have managed to eradicate several of those wasteful installations, but there is a set that I am only now getting buy-in to virtualize. 

And today’s adventure begins with a Windows Server 2003 SP2 machine installed Boot from SAN on an IBM HS21-XM Blade server.

First attempt:

1.  Convert physical server

2.  Virtual machine name

3.  Scan System


Looks good..

4. Conversion options


we can try the defaults..

5.  Specify the processor and memory… 

6.  Select the host, path, network, start options, etc..

7.  The job starts, the machine gets copied over, and …

That try resulted in a blue screen loop.. 


Ok… time to try the Offline conversion:

1. Proceed as above but select the Offline conversion option at step 4.

2.  hmm..  conversion warnings… must correct to proceed..

Warning (13246)
No compatible drivers were identified for the device: Broadcom BCM5708S NetXtreme II GigE (NDIS VBD Client). The offline physical-to-virtual conversion requires a driver for this device.

Device Type: network adapter
Device Description: Broadcom BCM5708S NetXtreme II GigE (NDIS VBD Client)
Device Manufacturer: Broadcom Corporation
Hardware IDs (listed in order of preference):
B06BDRV\L2ND&PCI_16AC14E4&SUBSYS_03271014&REV_12

Compatible IDs (listed in order of preference):
B06BDRV\L2ND&PCI_16AC14E4&SUBSYS_03271014
B06BDRV\L2ND&PCI_16AC14E4
B06BDRV\L2ND

Recommended Action
Create a new folder under C:\Program Files\Microsoft System Center Virtual Machine Manager 2008 R2\Driver Import on the Virtual Machine Manager server and then copy the necessary 32-bit Windows Vista driver package files for this device to the new folder. The driver package files include the driver (.sys) and installation (.inf and .cat) files. Check the device manufacturer’s website for the necessary drivers.

We don’t really need to do that, right…

I had some trouble with that part…  I finally figured out that the drivers that need to be placed in that folder are the “RIS” (Remote Installation Services) drivers. 
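For what it is worth, the copy step from the Recommended Action can be sketched in PowerShell.  This is only a sketch: the function name Copy-P2VDriverPackage and the source folder in the usage comment are made up for illustration, and the Driver Import path is the one quoted in the warning above.

```powershell
# Hypothetical helper (not part of VMM): copies the driver package files
# (.sys, .inf, .cat) into a new subfolder of the VMM "Driver Import" folder.
function Copy-P2VDriverPackage {
    param(
        [Parameter(Mandatory = $true)] [string] $SourceFolder,      # extracted driver package
        [Parameter(Mandatory = $true)] [string] $DriverImportRoot,  # VMM "Driver Import" folder
        [Parameter(Mandatory = $true)] [string] $PackageName        # name of the new subfolder
    )

    $target = Join-Path $DriverImportRoot $PackageName
    # -Force creates missing parent folders and does not fail if the folder exists
    New-Item -ItemType Directory -Path $target -Force | Out-Null

    $wanted = '.sys', '.inf', '.cat'
    Get-ChildItem -Path $SourceFolder |
        Where-Object { -not $_.PSIsContainer -and ($wanted -contains $_.Extension) } |
        ForEach-Object { Copy-Item -LiteralPath $_.FullName -Destination $target }

    # Return the copied file names so the result can be checked
    Get-ChildItem -Path $target | ForEach-Object { $_.Name }
}

# Usage (the source folder is an assumption, adjust to where the RIS drivers were extracted):
# Copy-P2VDriverPackage -SourceFolder 'C:\Temp\bcm5708s-ris' `
#     -DriverImportRoot 'C:\Program Files\Microsoft System Center Virtual Machine Manager 2008 R2\Driver Import' `
#     -PackageName 'BCM5708S'
```

Only the three file types named in the warning are copied, so readme files and installers in the vendor download stay out of the import folder.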

Try number 3 (or 30, I lost count)…

1. Proceed as try number 2, ignore warning because we did put the driver in there, and

Blue screen loop…

Hmm… maybe this is just not meant to be.  Did some more searching and found this article:

http://blogs.msdn.com/b/robertvi/archive/2009/10/07/after-installing-hyper-v-integration-services-on-the-next-reboot-the-vm-displays-bsod-0x0000007b.aspx 

Basically, there are some people seeing the exact same blue screen that I was seeing, except this was after the install of updated integration components.  But I wasn’t installing integration components yet… or was I?

image

Ok, so maybe it was getting that far and just “blowing up” after the install of the components.  Good thing about this being a P2V, I can go back to the source machine pretty easily and check the registry:

image

Looks like we may have an answer here.  Change the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Wdf01000\Group entry to be WdfLoadGroup instead of base. 

My guess is that this would have worked even with the online conversion option.
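The registry change can also be made from PowerShell instead of regedit.  A minimal sketch, assuming PowerShell is installed on the source machine (it is not there by default on Windows Server 2003) and that it is run as an administrator before the conversion.  Test-WdfGroupNeedsFix is a made-up helper name.

```powershell
# Sketch only: run on the physical source machine before the P2V.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\Wdf01000'

# Made-up helper: does the Group value still need to be changed?
function Test-WdfGroupNeedsFix {
    param([string] $CurrentGroup)
    return ($CurrentGroup -ne 'WdfLoadGroup')
}

if (Test-Path $key) {
    $current = (Get-ItemProperty -Path $key -Name Group).Group
    Write-Host "Current Group value: $current"
    if (Test-WdfGroupNeedsFix $current) {
        # Change the load order group from "base" to "WdfLoadGroup"
        Set-ItemProperty -Path $key -Name Group -Value 'WdfLoadGroup'
        Write-Host 'Group set to WdfLoadGroup'
    }
}
else {
    Write-Host "$key not found; nothing to change"
}
```

The script only writes when the value differs, so running it a second time is harmless.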

Influencers Blog

So the System Center guys have provided a place for people who work with System Center products to see a conglomeration of posts from various professionals who have registered to Blog about System Center products.  How fun…

Blog Posts by System Center Influencers



Below are the most recent posts from several of the members of the System Center Influencers Program. Note that Microsoft does not review the content or endorse it in any way; we present this content in a feed form for your information and convenience. (In the event that the feed refuses to render due to the flakiness of the third-party feed service, simply use the feed embedded in the RSS icon above.)

Nexus SC: The System Center Team Blog : Blog Posts by System Center Influencers

Error installing DPM 2010 Beta

I was installing the DPM 2010 Beta (finally) and had an issue trying to get SQL 2008 to install.  I finally figured out that I had the install files stored too deeply in a network share.  I figured this out by running the SQL install directly; when it went to check prerequisites, it reported an error in one section, and clicking for more info gave this:

Rule "Long path names to files on SQL Server installation media" failed.

SQL Server installation media on a network share or in a custom folder can cause installation failure if the total length of the path exceeds 260 characters. To correct this issue, utilize Net Use functionality or shorten the path name to the SQL Server setup.exe file.

So, I moved it to a shorter path and it installed just fine.
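A quick way to see how close an install folder is to that limit is to list the files with the longest full paths.  This is a rough sketch; Find-LongPath is a made-up name, and the share in the usage comment is only an example.  On the Windows PowerShell versions of that era, Get-ChildItem may itself fail on paths past the limit, so a lower -Limit is the practical way to spot folders that are getting close.

```powershell
# Made-up helper: lists files whose full path is longer than a given limit.
function Find-LongPath {
    param(
        [Parameter(Mandatory = $true)] [string] $Root,  # folder holding the install media
        [int] $Limit = 260                              # the limit named in the setup rule
    )
    Get-ChildItem -Path $Root -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.FullName.Length -gt $Limit } |
        ForEach-Object { $_.FullName }
}

# Usage (the share name is an assumption):
# Find-LongPath -Root '\\fileserver\software\Microsoft\DPM2010Beta' -Limit 200
```

If anything shows up, mapping the deep folder to a drive letter with net use, or moving it closer to the root of the share, shortens every path under it at once.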

Data Protection Manager 2010

So I am a bit late realizing this, but the Beta for DPM 2010 is available now on the Connect site.  I haven’t read anything on it yet, so mainly I am just posting this to make myself look into it.

https://connect.microsoft.com/Downloads/DownloadDetails.aspx?SiteID=840&DownloadID=22070

Follow up to the DPM recovery point expiration issues

Previously, I blogged about issues I was having where old recovery points were not being expired/removed from my DPM servers.  I had to open a ticket with Microsoft, and worked with them to determine the cause, and since then, they have released a fix.

The fix that Microsoft developed is here: http://www.microsoft.com/downloads/details.aspx?FamilyID=aee949aa-d3e7-4b0f-b718-00b7c20f1257&displayLang=en

A few people have asked for the PowerShell script “show-pruneshadowcopies.ps1” that Microsoft provided and I mentioned in my previous post (here).  The script looks like this:

#displays all RP for data sources and shows which RP’s would be deleted by the regular pruneshadowcopies.ps1
# Outputs to a logfile in the current directory: SHOW-PRUNESHADOWCOPIES.LOG.txt (run it from C:\Program Files\Microsoft DPM\DPM\bin to keep the log there)

#Author    : Mike J
#Date    : 02/24/2009
$version="V1.0"

$date=get-date
$logfile="SHOW-PRUNESHADOWCOPIES.LOG.txt"

function GetDistinctDays([Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.ProtectionGroup] $group,
[Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.Datasource] $ds)
{   
    if($group.ProtectionType -eq [Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.OMCommon.ProtectionType]::DiskToTape)
    {
        return 0
    }
    $scheduleList = get-policyschedule -ProtectionGroup $group -ShortTerm
    if($ds -is [Microsoft.Internal.EnterpriseStorage.Dls.UI.ObjectModel.FileSystem.FsDataSource])
    {
        $jobType = [Microsoft.Internal.EnterpriseStorage.Dls.Intent.JobTypeType]::ShadowCopy
    }
    else
    {
        $jobType = [Microsoft.Internal.EnterpriseStorage.Dls.Intent.JobTypeType]::FullReplicationForApplication
        if($ds.ProtectionType -eq [Microsoft.Internal.EnterpriseStorage.Dls.Intent.ReplicaProtectionType]::ProtectFromDPM)
        {           
            return 2
        }
    }
    write-host   "Look for jobType $jobType"

    foreach($schedule in $scheduleList)
    {
        write-host("schedule jobType {0}" -f $schedule.JobType)
        if($schedule.JobType -eq $jobType)
        {
            return [Math]::Ceiling(($schedule.WeekDays.Length * $ds.RecoveryRangeinDays) / 7)
        }
    }

    return 0
}

function IsShadowCopyExternal($id)
{
    $result = $false;

    $ctx = New-Object -Typename Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext
    $ctx.Open()

    $cmd = $ctx.CreateCommand()
    $cmd.CommandText = "select COUNT(*) from tbl_RM_ShadowCopy where shadowcopyid = '$id'"
    write-host $cmd.CommandText
    $countObj = $cmd.ExecuteScalar()
    write-host $countObj
    if ($countObj -eq 0)
    {
        $result = $true
    }
    $cmd.Dispose()
    $ctx.Close()

    return $result
}

function IsShadowCopyInUse($id)
{
    $result = $true;

    $ctx = New-Object -Typename Microsoft.Internal.EnterpriseStorage.Dls.DB.SqlContext
    $ctx.Open()

    $cmd = $ctx.CreateCommand()
    $cmd.CommandText = "select ArchiveTaskId, RecoveryJobId from tbl_RM_ShadowCopy where ShadowCopyId = '$id'"
    write-host $cmd.CommandText
    $reader = $cmd.ExecuteReader()
    while($reader.Read())
    {
        if ($reader.IsDBNull(0) -and $reader.IsDBNull(1))
        {
            $result = $false
        }
    }
    $cmd.Dispose()
    $ctx.Close()

    return $result
}

"**********************************" > $logfile
"Version $version" >> $logfile
get-date >> $logfile

$dpmservername = &"hostname"

$dpmsrv = connect-dpmserver $dpmservername

if (!$dpmsrv)
{
    write-host "Unable to connect to $dpmservername"
    exit 1
}

write-host $dpmservername
"Selected DPM server = $DPMservername" >> $logfile
$pgList = get-protectiongroup $dpmservername
if (!$pgList)
{
    write-host   "No PGs found"
    disconnect-dpmserver $dpmservername
    exit 2
}

write-host("Number of ProtectionGroups = {0}" -f $pgList.Length)
$replicaList = @{}
$latestScDateList = @{}

foreach($pg in $pgList)
{
    $dslist = get-datasource $pg
    if ($dslist.length -gt 0)
    {
    write-host("Number of datasources in this PG = {0}" -f $dslist.length)
    ("Number of datasources in this PG = {0}" -f $dslist.length) >> $logfile
     }
    Foreach ($ds in $dslist)
    {
       write-host("DS NAME=  $ds")
       ("DS NAME=  $ds") >>$logfile
    }
    foreach ($ds in $dslist)
    {       
        $rplist = get-recoverypoint $ds | where { $_.DataLocation -eq 'Disk' }
        write-host("Number of recovery points for $ds {0}" -f $rplist.length)
        ("Number of recovery points for $ds {0}" -f $rplist.length) >>$logfile 
        $countDistinctDays = GetDistinctDays $pg $ds
        write-host("Number of days with fulls = $countDistinctDays")
        ("Number of days with fulls = $countDistinctDays") >>$logfile
        if($countDistinctDays -eq 0)
        {
            write-host   "D2T PG. No recovery points to delete"
            "D2T PG. No recovery points to delete" >>$logfile
            continue;
        }
        $replicaList[$ds.ReplicaPath] = $ds.RecoveryRangeinDays
        $latestScDateList[$ds.ReplicaPath] = new-object DateTime 0,0
        $lastDayOfRetentionRange = ([DateTime]::UtcNow).AddDays($ds.RecoveryRangeinDays * -1);       
        write-host("Distinct days to count = {0}. LastDayOfRetentionRange = {1} " -f $countDistinctDays, $lastDayOfRetentionRange)
        ("Distinct days to count = {0}. LastDayOfRetentionRange = {1} " -f $countDistinctDays, $lastDayOfRetentionRange) >>$logfile
        $distinctDays = 0;
        $lastDistinctDay = (get-Date).Date
        $numberOfRecoveryPointsDeleted = 0

        if ($rplist)
        {
            foreach ($rp in ($rplist | sort-object -property UtcRepresentedPointInTime -descending))
            {                       
                if ($rp)
                {                   
                    if ($rp.UtcRepresentedPointInTime.Date -lt $lastDistinctDay)
                    {
                        $distinctDays += 1
                        $lastDistinctDay = $rp.UtcRepresentedPointInTime.Date
                    }
                    write-host(" $ds")
                    (" $ds") >>$logfile
                    write-host("  Recovery Point #$distinctdays RPtime={0}" -f $rp.UtcRepresentedPointInTime)
                    ("  Recovery Point #$distinctdays RPtime={0}" -f $rp.UtcRepresentedPointInTime) >>$logfile
                    if (($distinctDays -gt $countDistinctDays) -and ($rp.UtcRepresentedPointInTime -lt $lastDayOfRetentionRange))
                    {
                        write-host ("Recovery Point would be deleted ! – RPtime={0}" -f $rp.UtcRepresentedPointInTime)  -foregroundcolor red
                        ("Recovery Point would be deleted ! – RPtime={0} <<<<<<<" -f $rp.UtcRepresentedPointInTime) >>$logfile
#remove-recoverypoint $rp -ForceDeletion -confirm:$true | out-null
                        $numberOfRecoveryPointsDeleted += 1
                    }
                    else
                    {
                        write-host "    Recovery point not expired yet"
                        "    Recovery point not yet expired" >>$logfile
                    }
                }
                else
                {
                    write-host "Got a NULL rp"
                    "Got a NULL rp" >>$logfile
                }   
            }

            write-host "Number of RPs that would be deleted = $numberOfRecoveryPointsDeleted"  
            "Number of RPs that would be deleted = $numberOfRecoveryPointsDeleted" >>$logfile            
        }
    }
}

disconnect-dpmserver $dpmservername
write-host "Exiting from script"

exit
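To make the rule in that script easier to follow, here is the expiry test pulled out and applied to plain dates, with no DPM cmdlets involved.  It is an illustration only: both function names are made up, and it uses the date of -Now where the script uses (get-Date).Date.

```powershell
# Made-up name: the "days with fulls" formula from GetDistinctDays.
function Get-DaysWithFulls {
    param([int] $WeekDayCount, [int] $RetentionRangeDays)
    return [Math]::Ceiling(($WeekDayCount * $RetentionRangeDays) / 7)
}

# Made-up name: returns the recovery point times the script would flag.
function Get-ExpiredRecoveryPoint {
    param(
        [datetime[]] $RecoveryPointTimes,      # UTC times of the recovery points
        [int] $DaysToKeep,                     # $countDistinctDays in the script
        [int] $RetentionRangeDays,             # $ds.RecoveryRangeinDays in the script
        [datetime] $Now = [DateTime]::UtcNow
    )
    $cutoff = $Now.AddDays(-1 * $RetentionRangeDays)
    $distinctDays = 0
    $lastDistinctDay = $Now.Date
    foreach ($t in ($RecoveryPointTimes | Sort-Object -Descending)) {
        # Count each calendar day once, walking from newest to oldest
        if ($t.Date -lt $lastDistinctDay) {
            $distinctDays += 1
            $lastDistinctDay = $t.Date
        }
        # Flag only when BOTH conditions hold: enough newer days are kept
        # and the point is older than the retention range
        if (($distinctDays -gt $DaysToKeep) -and ($t -lt $cutoff)) {
            $t
        }
    }
}
```

With one recovery point per day, a 5 day range and 5 days to keep, only the points older than the fifth distinct day come back.  A recovery point is never flagged just for being old if fewer than the required number of newer days exist.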

DPM v 3

I just watched a webcast on DPM v3 and thought I would share some of what I got from that.

In the last 18 months, DPM 2007 (v2) delivered application protection for Exchange, SQL Server, SharePoint and virtualization environments running Virtual Server and Hyper-V.  Disaster recovery with Iron Mountain, Local Datasource Protection and Client backups have also come out through DPM 2007, its first feature update and Service Pack 1.  Now it is time to show what is coming next for DPM. 

A few top line items are support for the following:

  • support for Exchange 14, and more granular restore
  • protect the entire SQL instance, and auto discover new DB’s
  • protect 1000’s of DBs per DPM server
  • End User Recovery by the SQL Admin (role based access from the DPM console)
  • Office 14
  • AD appears as a data source in DPM UI
  • Image restore from centrally managed DPM server – executed locally
  • Support for Windows guest on VMware hosts
  • SAP running on MS SQL

and some other improvements:

  • up to 100 servers, 1000 laptops, 2000 databases per DPM server
  • management pack updates
  • automatic re-running of jobs and improved self-healing  —  This is a huge one in my book
  • auto protect new sources for SQL and MOSS
  • improved scheduling capabilities
  • one click DPM DR failover and failback
  • continued support for SAN (scripts/whitepapers)

platform requirements:

  • DPM Server must be 64-bit Windows Server 2008 R2
  • Integration capability with Windows EBS 2008 R2

P2V fails at Copy Hard Disk

I have been trying to get a P2V of a production system to use in our DR plan.  I have limited opportunity to do this, because I am not allowed to impact performance during production hours for this system, and the definition of production hours is fairly broad.  I have been trying for a couple of months to get this figured out.

We have our regularly scheduled maintenance once a month on the third Thursday of the month.  This is pretty awesome in that we are at liberty (most months) to take everything down from 6PM until 6AM.  I look at it as giving the company an evening off. :)

So, that being tonight, I had it in my mind that I was going to beat the OAS boxes.  (Oracle Application Servers, part of our new JD Edwards ERP system.)  They are an interesting setup, because they are using Apache, which as great as it may be, isn’t something I have much experience with.  They have a loopback adapter for use with the load balancing setup that they are in.  The load balancing is performed using our Cisco switches, which as great as they are, I don’t know very much about.  All in all, they are pretty complicated to troubleshoot in this case, because there are so many pieces that I am not completely familiar with. 

Such is life…

Anyway,  after a lot of hunting and a lot of posting in forums, I found an event that actually led to a solution. I probably should have found this before, and maybe I did, but didn’t pay enough attention… 

These are the exact symptoms that I had, and the errors in the event log were there, but the machine that I am trying to convert is a Windows 2003 Server, not Windows XP:

The P2V process fails at 40% when you try to run the P2V process by using Microsoft System Center Virtual Machine Manager 2008 on a source computer that is running Windows XP

You use Microsoft System Center Virtual Machine Manager 2008 to run the Physical-to-Virtual (P2V) process on a source computer that is running Windows XP. However, the process fails at 40% complete, and the following error is logged in the event log on the computer that has System Center Virtual Machine Manager (SCVMM) 2008 installed:

Type:		Warning
Date:		<Date>
Time:		<Time>
Event:		1706
Source:		Virtual Machine Manager
Category:	None
Computer:	<Computer Name>
Event Msg:	Job 7bfcd14a-884e-4a71-9984-3274622adeb7 (Physical-to-virtual conversion) failed to complete. 7bfcd14a-884e-4a71-9984-3274622adeb7 Physical-to-virtual conversion TaskFailed    

Additionally, you will find the following error logged in the event log on the source computer:

Type:		Error
Date:		<Date>
Time:		<Time>
Event:		15005
Source:		HTTP
Category:	None
Computer:	<Computer Name>
Event Msg:	Unable to bind to the underlying transport for 0.0.0.0:443. The IP Listen-Only list may contain a reference to an interface which may not exist on this machine.  The data field contains the error number.
Data:
 00 00 04 00 02 00 52 00 00 00 00 00 9D 3A 00 C0		 . . . . . . R . . . . . . . . À
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00		 . . . . . . . . . . . . . . . .
 00 00 00 00 00 00 00 00 43 00 00 C0				 . . . . . . . . C . . À

The P2V process fails at 40% when you try to run the P2V process by using Microsoft System Center Virtual Machine Manager 2008 on a source computer that is running Windows XP

DPM does not remove expired recovery points

I have been using DPM for about 7 months now.  (I tested with it for a few months before that.)  I never installed 2006, but 2007 seems to be working ok.  I have a few complaints, but I have complaints about all the backup software that I have ever used.  None of it really makes me happy.  But on to the story…

I have 3 production DPM servers.  One of them has a large number of protection group members.  8 Protection Groups, 328 Members.  And that is just to protect 39 computers, but one of the SQL servers has about 150 databases.

I noticed the problem because I kept running out of space on the Recovery Point volumes.  I had a particular 2008 Domain Controller whose system state recovery point volume had to be extended every couple of days.  I was keeping the recovery points on disk for 5 days, so it finally occurred to me that it shouldn’t take more than 200 GB to keep 5 days’ worth of recovery points for the system state. 

I called and opened a ticket with Microsoft and we have been working on this for almost 2 months.  So far, the best that I can tell is that the process that clears the old recovery points slowly eats up memory.  This, coupled with the fact that I have a lot of PG members, means that the job frequently fails before it completes.  If the number of recovery points continues to grow, the job that clears them (pruneshadowcopies) takes longer and takes more memory.  This increases the chance that it will fail…

I don’t have a solution to this problem yet, other than a few work-arounds and a way to manually run the process:

  • add more RAM to your DPM Server.  Especially if you are running SQL locally on the box.
  • reduce the number of PG members.  Fewer members, fewer recovery points, less chance the prune job will fail.
  • open the DPM Management Shell (DPM PowerShell) and run “pruneshadowcopies.ps1”.  This will manually run the job that is triggered by DPM at midnight every night.  If you have a lot of recovery points that haven’t been pruned, then this will probably fail (crash) a few times before it finishes.  I have had it run all weekend before and then crash, and I have seen it run for just an hour and then crash.  Keep running it, and it will eventually finish. 
  • Hope that Microsoft comes up with a real fix soon…
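The “keep running it” workaround can be wrapped in a small retry loop so it does not need babysitting.  This is a sketch under a few assumptions: Invoke-UntilSuccess is a made-up name, try/catch needs PowerShell 2.0 or later (on a PowerShell 1.0 shell, trap would be the equivalent), and it assumes a crash of the prune script surfaces as a terminating error, which I have not verified.

```powershell
# Made-up helper: re-runs a script block until it completes without a
# terminating error, or until the attempt limit is reached.
function Invoke-UntilSuccess {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $Action,
        [int] $MaxAttempts = 10,
        [int] $DelaySeconds = 60
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Send the action's output to the console so only the
            # attempt number is returned to the caller
            & $Action | Out-Host
            Write-Host "Attempt $attempt finished"
            return $attempt
        }
        catch {
            Write-Host "Attempt $attempt failed: $_"
            if ($attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
        }
    }
    throw "Gave up after $MaxAttempts attempts"
}

# Usage from the DPM Management Shell (the path assumes a default DPM install):
# Invoke-UntilSuccess -Action { & 'C:\Program Files\Microsoft DPM\DPM\bin\pruneshadowcopies.ps1' }
```

The attempt limit keeps a job that can never finish from looping forever, and the pause between attempts gives the server a chance to release memory before the next run.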

To see if you have this problem, there is a version of the pruneshadowcopies script that just shows the recovery points, without actually expiring them.  The tech that I have been working with on my case sent it to me. 

Remove DPM agent from the DPM agent console

I blogged about this last year, but when I moved my blog, I lost part of the post (the picture) so I just deleted the post.  Then I noticed that Google is still sending people here to find the answer, so…

If you have a DPM protected computer that goes away before you uninstall the agent, it isn’t obvious how you get the agent removed from the console.  Or at least it wasn’t immediately obvious to me.

  1. In the Management/Agents tab, right click on the agent (it will have a red x and “Unavailable” in the Agent Status column) and select Uninstall…
  2. Verify your list of agents (you can select more than one)
  3. Click on “Uninstall Agents”
  4. Enter the appropriate credentials.  This must be an account that has permissions to remove the agent from the DPM server.  Even though the protected computer doesn’t exist, it still has to be a valid account.
  5. Select the “Manually restart the selected servers later” radio button
  6. Click ok.

So far, that isn’t any different than any other client uninstall.   At this point, you will have the option to close the window, and go on about your business.  And if the protected computer was still available, that would be perfectly fine to do.  But since the protected computer isn’t still available, you have to wait for the error to pop up.  First you will see that the uninstall failed and then you get this message:

image

Basically, it says, I couldn’t find that computer to remove the agent, you want me to just forget that it existed?  You click on “Yes” and then the entry for that computer is removed from the DPM database.  Now wasn’t that obvious?

New Features in System Center Virtual Machine Manager 2008 R2

Of the new features coming in the R2 versions of Windows Server 2008 and SCVMM, I think these two are the obvious winners:

Support for Live Migration: With Windows 2008 R2 adding support for Live migration, it’s now added as a new migration option in VMM R2. Live migration requires the source and destination host to be part of a failover cluster and that the VM is on a shared storage. Live migration means that there is no user perceived downtime; since the VM’s memory pages are being transferred, the hosts’ processors need to be the same (manufacturer and processor architecture). Our competition claims that Vmotion doesn’t require clustering but this only works for planned downtime and not for unplanned downtime. By combining Live migration and clustering, Hyper-V addresses both planned and unplanned downtime.

Multiple VMs per LUN: VMM 2008 didn’t allow placing multiple VMs per LUN even though Hyper-V allowed it and the reason was that the LUN ownership was on a per host basis. This meant that migrating any VM on that shared LUN would result in all other VMs being migrated as well which can result in a confusing user experience (I’ve blogged about this at length). With CSV (Clustered Shared Volumes) in Windows 2008 R2, a single LUN is accessible by all hosts within a cluster. This enables a VM that’s on a shared LUN to be migrated without affecting other VMs on that LUN. As a result, with VMM R2, we’ll allow multiple VMs to be placed on the same LUN if CSV is enabled on the cluster.

http://blogs.technet.com/rakeshm/archive/2009/03/16/scvmm-2008-r2-beta-is-available-now.aspx

That is from the beta release announcement for SCVMM.  I have downloaded the beta, but haven’t had time lately to get it set up.  I am hoping to work on that this coming week…