Skip to content
Please note that GitHub no longer supports your web browser.

We recommend upgrading to the latest Google Chrome or Firefox.

Learn more
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resurrecting Azure Execution provider #1121

Merged
merged 25 commits into from Jul 17, 2019
Merged

Conversation

@benhg
Copy link
Member

benhg commented Jul 7, 2019

This pull request re-adds the Azure execution provider.

Maximum number of blocks to maintain. Default is 10.
region : str
Azure region to launch machines. Default is 'westus'.
key_name : str

This comment has been minimized.

Copy link
@benclifford

benclifford Jul 8, 2019

Contributor

does this get used anywhere?

This comment has been minimized.

Copy link
@benhg

benhg Jul 8, 2019

Author Member

Azure requires an "admin password" to be set for Linux VM instances (which I really don't like) but optionally allows for SSH keys to be used as well. I am open to opinions on whether it would be useful to have keys implemented, knowing that passwords are required anyways.

This comment has been minimized.

Copy link
@benclifford

benclifford Jul 8, 2019

Contributor

I'm a big fan of SSH keys, so I'm not going to say no.

@benclifford

This comment has been minimized.

Copy link
Contributor

benclifford commented Jul 8, 2019

I've made a bunch of comments - I think they're all usability related rather than anything to do with functionality.

I'm not set up for azure so I can't run this code so I won't click approve.

@benhg

This comment has been minimized.

Copy link
Member Author

benhg commented Jul 8, 2019

I have asked questions on a couple of them - will fix the rest later today. Thanks for the review !

benhg added 8 commits Jul 9, 2019
…into benhg-azure-provider
test-requirements.txt Outdated Show resolved Hide resolved
@benclifford

This comment has been minimized.

Copy link
Contributor

benclifford commented Jul 11, 2019

I have fiddled with azure enough to have this start up a VM for me, using the below configuration. However it doesn't seem to successfully start up the htex workers - from dmesg on the started VM, it looks like maybe I'm running out of memory so I'll try a larger one.

Killing parsl with ctrl-C doesn't kill the VMs, which is worrying as that leaves a drain on real money behind at the end of the run.

config = Config(
    executors=[
        HighThroughputExecutor(
            label='azure_htex_single_node',
            address="fitzroy.cqx.ltd.uk",
            provider=AzureProvider(
                vm_reference=
                  {
                    'publisher': "Canonical",
                    'offer': "UbuntuServer",
                    'sku': "19.04",
                    'version': "latest",

                    'vm_size': "Standard_B1ls",

                    'disk_size_gb': 10,
                    "admin_username": "benc",
                    "password": "ligo123"
                },

                key_name='aws-us-east1',
                key_file='./azurecreds.pem',
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
)
@benclifford

This comment has been minimized.

Copy link
Contributor

benclifford commented Jul 11, 2019

Using the below config I am able to run some simple test @python_apps:

config = Config(
    executors=[
        HighThroughputExecutor(
            label='azure_htex_single_node',
            address="fitzroy.cqx.ltd.uk",
            provider=AzureProvider(
                vm_reference=
                  {
                    'publisher': "Canonical",
                    'offer': "UbuntuServer",
                    'sku': "19.04",
                    'version': "latest",

                    'vm_size': "Standard_B4ms",

                    'disk_size_gb': 10,
                    "admin_username": "benc",
                    "password": "ligo123"
                },

                key_name='aws-us-east1',
                key_file='./azurecreds.pem',
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
)
Required structure:
{
'publisher': VM OS publisher

This comment has been minimized.

Copy link
@benclifford

benclifford Jul 11, 2019

Contributor

I found it quite awkward to find a userful publisher/offer/sku/version combo - it involved a lot of messing round in the Azure Marketplace. I don't know if there is a better way. But make some note about where to get these values from.

@benhg

This comment has been minimized.

Copy link
Member Author

benhg commented Jul 11, 2019

Killing parsl with ctrl-C doesn't kill the VMs, which is worrying as that leaves a drain on real money behind at the end of the run.

Killing parsl with ctrl-C should send cancel() requests and thus should close all of the VMs. In my testing, this worked as it should. Perhaps your VMs running out of memory caused this to not occur. Did the VMs go away with the updated, working config?

With regards to the rest of these changes, I will make them this evening.

@benhg

This comment has been minimized.

Copy link
Member Author

benhg commented Jul 12, 2019

I've done some more testing on the issue of not deleting on ctrl-C. If a submit() call is not still running, the VMs will indeed delete themselves. Unfortunately, Azure's API for running scripts on the VM (required for connecting back to Parsl controller) is both synchronous and very slow (30 sec min), so if you ctrl-C within 30 sec or so of a submit() request, Parsl doesn't know that an instance has been started, and so can't clean it up on shutdown. Any ideas on how to deal with this?

@benclifford

This comment has been minimized.

Copy link
Contributor

benclifford commented Jul 12, 2019

There's enough information stored at this point, before the script is run, to be able to kill the VM, even though parsl doesn't know about it.

submit, line 255-256

        vm_info = async_vm_creation.result()
        self.instances.append(vm_info.name)

Once you have that info, maybe you could put a try block around the whole of the rest of the code in submit that deals with killing the VM if an exception occurs. I think you might be able to see a KeyboardInterrupt exception there when ctrl-C is pressed (I'm not sure).

The idea being that until submit returns, it is the responsibility of that try block to tidy up; and then as soon as it ends and passes the ID back to the rest of parsl, it is the responsibility of the rest of parsl to call cancel.

@benhg

This comment has been minimized.

Copy link
Member Author

benhg commented Jul 12, 2019

There's enough information stored at this point, before the script is run, to be able to kill the VM, even though parsl doesn't know about it.

submit, line 255-256

        vm_info = async_vm_creation.result()
        self.instances.append(vm_info.name)

Once you have that info, maybe you could put a try block around the whole of the rest of the code in submit that deals with killing the VM if an exception occurs. I think you might be able to see a KeyboardInterrupt exception there when ctrl-C is pressed (I'm not sure).

The idea being that until submit returns, it is the responsibility of that try block to tidy up; and then as soon as it ends and passes the ID back to the rest of parsl, it is the responsibility of the rest of parsl to call cancel.

This makes sense to me. I will make those changes later today.

Done

benhg added 3 commits Jul 12, 2019
@benclifford

This comment has been minimized.

Copy link
Contributor

benclifford commented Jul 15, 2019

ok this is looking good. I will give it another try out probably tomorrow and hopefully will merge it then

benclifford added 2 commits Jul 16, 2019
@benclifford benclifford self-requested a review Jul 17, 2019
@benclifford benclifford merged commit 1aebf26 into Parsl:master Jul 17, 2019
1 check passed
1 check passed
Travis CI - Pull Request Build Passed
Details
@benclifford benclifford mentioned this pull request Aug 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.