Manually configuring Data Science service on Oracle Cloud Infrastructure
1 Introduction
Learn how to get started configuring your tenancy for Data Science and test creating a notebook session.
This tutorial is directed at administrator users because they are granted the required access permissions.
In this tutorial, you will:
- Create a data scientists user group
- Create a compartment for your work
- Create a VCN and subnet
- Create policies
- Create a dynamic group and write policies for it
- Create a notebook session
2 Before You Begin
To perform this tutorial successfully, you must have the following:
- An OCI account with administrator privileges. See Signing Up for Oracle Cloud Infrastructure.
- At least one user in your tenancy who wants to access the Data Science service. This user must be created in IAM.
3 Creating a Data Scientists User Group
You must create a user group for the data scientists to work in.
- Open a supported browser and enter the Console URL, https://console.<tenancy-region>.oraclecloud.com.
The <tenancy-region> can be us-ashburn-1, us-phoenix-1, and so on. Use one of the Data Science supported Regions and Availability Domains.
- Enter your cloud tenant and click Continue.
- Sign in with your credentials.
- Open the navigation menu and click Identity & Security. Under Identity, click Groups.
A list of the groups in your tenancy displays.
- Click Create Group.
- Create a data-scientists group and enter a description.
- Click Create.
The details page for the data-scientists group that you created opens.
- Click Add User to Group.
- Select a user to add, and then click Add.
The selected user is added and appears in the group member list.
- Repeat the previous two steps until all of your data scientist users are added to the data-scientists group.
A list of the users in your tenancy displays.
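The Console URL in the first step follows a fixed pattern built from the region identifier. As a minimal sketch, the helper below (a hypothetical function name, not part of any Oracle tool) fills in that pattern so you can see what the final URL looks like for a given region:

```python
def console_url(region: str) -> str:
    """Return the Console URL for a region identifier such as 'us-ashburn-1'."""
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    # Pattern taken from the tutorial: https://console.<tenancy-region>.oraclecloud.com
    return f"https://console.{region}.oraclecloud.com"


print(console_url("us-ashburn-1"))  # https://console.us-ashburn-1.oraclecloud.com
```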
4 Creating a Compartment for Your Work
Next, you create a compartment for your data science resources.
- Open the navigation menu and click Identity & Security. Under Identity, click Compartments.
- Click Create Compartment to create your compartment.
- Name the new compartment data-science-work, and enter a description.
- Click Create Compartment.
The data-science-work compartment is created and added to the compartments list.
5 Creating a VCN and Subnet
You need to create a virtual cloud network (VCN) for use by the Data Science service.
Note: For egress access to the public internet, we recommend that you use a private subnet with a route to a NAT gateway. A NAT gateway gives instances in a private subnet access to the internet.
- Open the navigation menu and click Networking. Then click Virtual Cloud Networks.
- Select the compartment that you want to create the VCN in.
- Click Start VCN Wizard.
- Make sure that VCN with Internet Connectivity is selected, and then click Start VCN Wizard.
- Enter datascience-vcn for the VCN Name.
- Select the data-science-work compartment. The VCN is created in this compartment. A new compartment can take some time to appear in the drop-down list. If it is missing, refresh the page until it appears.
- Click Next.
- Use the Configure VCN and Subnets defaults as follows:
  - Make sure that Use DNS Hostnames in this VCN is selected.
- Click Next.
A review of the VCN configuration is displayed.
- Click Create to create the VCN and the related resources, including a public subnet, a private subnet, an internet gateway, and a NAT gateway.
Use this VCN and its private subnet when you create your notebook session.
- Click View Virtual Cloud Network to review your VCN and subnets.
6 Creating Policies
Before you can launch a notebook session, you have to configure the Data Science policies.
- Open the navigation menu and click Identity & Security. Under Identity, click Policies.
- Click Create Policy.
- Enter data-science-policy for the Name.
- Enter Policy for data science users and service as the Description.
- Select the data-science-work compartment.
- Click Show manual editor.
- Enter these three simple policy statements into the Policy Builder field:
To allow users in the data scientists group to perform all operations on projects, notebook sessions, models, and work requests that are found in the data-science-work compartment:
allow group data-scientists to manage data-science-family in compartment data-science-work
To allow those data scientists to use the VCN you just created and attach it to their notebook session:
allow group data-scientists to use virtual-network-family in compartment data-science-work
To allow the Data Science service to attach that VCN to your notebook session and route egress traffic from the notebook environment:
allow service datascience to use virtual-network-family in compartment data-science-work
- Click Create to create your policy.
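The three statements above differ only in the group and compartment names, so they are easy to generate if you use other names in your tenancy. The following is a minimal sketch, assuming a hypothetical helper named `data_science_policy`; it only builds the text that you paste into the manual editor and does not call any Oracle API:

```python
def data_science_policy(group: str, compartment: str) -> list[str]:
    """Build the three policy statements from this section as plain strings."""
    return [
        # Data scientists manage projects, notebook sessions, models, and work requests.
        f"allow group {group} to manage data-science-family in compartment {compartment}",
        # Data scientists can use the VCN and attach it to their notebook sessions.
        f"allow group {group} to use virtual-network-family in compartment {compartment}",
        # The Data Science service can attach the VCN and route egress traffic.
        f"allow service datascience to use virtual-network-family in compartment {compartment}",
    ]


statements = data_science_policy("data-scientists", "data-science-work")
print("\n".join(statements))
```

Called with the names used in this tutorial, the helper reproduces the three statements exactly as listed above.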
7 Creating a Dynamic Group and Writing Policies for It
To enable notebook sessions to access other OCI resources, such as Object Storage or model catalog, you have to create a dynamic group and write policies for the notebook sessions’ resource principals.
- Open the navigation menu and click Identity & Security. Under Identity, click Compartments.
- Click the data-science-work compartment.
The compartment details page is displayed.
- Click Copy to save the entire OCID to your clipboard.
- Click Compartments to return to the list of compartments.
- Click Dynamic Groups.
- Click Create Dynamic Group.
- Enter the following:
  - Name: data-science-dynamic-group
  - Description: Data Science dynamic group
- Enter this matching rule. Replace <compartment-ocid> with the compartment OCID you copied.
ALL {resource.type = 'datasciencenotebooksession', resource.compartment.id = '<compartment-ocid>'}
This matching rule means that all notebook sessions created in your compartment are added to data-science-dynamic-group.
- Click Create.
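Substituting the OCID into the matching rule by hand is a common place for quoting mistakes. As a rough sketch, the hypothetical helper below builds the rule from the OCID you copied. The prefix check assumes that a non-root compartment OCID begins with `ocid1.compartment.`, and the OCID in the example is a made-up placeholder:

```python
def notebook_matching_rule(compartment_ocid: str) -> str:
    """Return the dynamic group matching rule for notebook sessions in a compartment."""
    compartment_ocid = compartment_ocid.strip()
    # Assumption: OCIDs of non-root compartments start with this prefix.
    if not compartment_ocid.startswith("ocid1.compartment."):
        raise ValueError(f"not a compartment OCID: {compartment_ocid!r}")
    return (
        "ALL {resource.type = 'datasciencenotebooksession', "
        f"resource.compartment.id = '{compartment_ocid}'}}"
    )


# Placeholder OCID for illustration only; use the value copied from the Console.
print(notebook_matching_rule("ocid1.compartment.oc1..exampleuniqueid"))
```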
Next, write a policy to enable access for this dynamic group.
- Click Policies.
- Click Create Policy.
- Enter the following:
  - Name: data-science-dynamic-group-policy
  - Description: Policy for the Data Science dynamic group
- Select the data-science-work compartment.
- Click Show manual editor.
- Enter these policy statements into the Policy Builder field:
To allow the notebook sessions to perform CRUD operations on entries in the model catalog, projects, and notebook session resources:
allow dynamic-group data-science-dynamic-group to manage data-science-family in compartment data-science-work
To allow notebook sessions to perform CRUD operations on Data Flow applications and runs:
allow dynamic-group data-science-dynamic-group to manage dataflow-family in compartment data-science-work
To allow notebook sessions to list and read compartments and user names that are in the tenancy:
allow dynamic-group data-science-dynamic-group to read compartments in tenancy
allow dynamic-group data-science-dynamic-group to read users in tenancy
To allow notebook sessions to read and write files to object storage buckets that are located in the data-science-work compartment:
allow dynamic-group data-science-dynamic-group to manage object-family in compartment data-science-work
- Click Create to create the policy.
You can use this dynamic group with resource principals in notebook sessions.
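Every statement in this tutorial has the same shape: a subject, a verb, a resource type, and a location. The sketch below checks the dynamic group statements against that shape before you paste them into the editor. The regular expression is a deliberate simplification that covers only the forms used in this tutorial, not the full policy syntax, and `parse_statement` is a hypothetical helper name:

```python
import re
from typing import NamedTuple


class Statement(NamedTuple):
    subject_type: str  # group, dynamic-group, or service
    subject: str
    verb: str          # inspect, read, use, or manage
    resource: str
    location: str      # "tenancy" or "compartment <name>"


# Simplified grammar: only the statement forms that appear in this tutorial.
_PATTERN = re.compile(
    r"allow (group|dynamic-group|service) (\S+) "
    r"to (inspect|read|use|manage) (\S+) "
    r"in (tenancy|compartment \S+)",
    re.IGNORECASE,
)


def parse_statement(text: str) -> Statement:
    """Split one policy statement into its parts, or raise ValueError."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized policy statement: {text!r}")
    return Statement(*match.groups())


POLICY = """\
allow dynamic-group data-science-dynamic-group to manage data-science-family in compartment data-science-work
allow dynamic-group data-science-dynamic-group to manage dataflow-family in compartment data-science-work
allow dynamic-group data-science-dynamic-group to read compartments in tenancy
allow dynamic-group data-science-dynamic-group to read users in tenancy
allow dynamic-group data-science-dynamic-group to manage object-family in compartment data-science-work
"""

for line in POLICY.splitlines():
    s = parse_statement(line)
    print(f"{s.verb:<8}{s.resource:<22}{s.location}")
```

A statement that was accidentally fused with its neighbor on one line fails the `fullmatch` check, which makes that kind of copy-and-paste error easy to spot.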
8 Creating a Notebook Session
Lastly, create a notebook session, and then test its access to the public internet.
- Open the navigation menu and click Analytics & AI. Under Machine Learning, click Data Science.
- Click Create Project.
- Select the data-science-work compartment.
- (Optional) Enter Initial Project for the Name.
- (Optional) Enter my first project for the Description.
- Click Create. The project details page appears.
- Click Create Notebook Session.
- Ensure that the data-science-work compartment is selected.
- (Optional) Enter my-first-notebook-session for the Name.
- Enter VM.Standard2.8 for the Instance Shape.
- Enter 100 for the Block Storage Size to attach to your virtual machine.
- Select the datascience-vcn VCN and the Private Subnet-datascience-vcn subnet to route egress traffic from your notebook session.
- Click Create to launch your first notebook session.
You are advanced to the notebook sessions page. Creating the notebook session takes a few minutes. When it’s complete, the status turns to Active, and you can open the notebook session.
- Click Open.
- Enter your Oracle Cloud Infrastructure credentials to access the JupyterLab UI.
- Click Terminal to perform a simple test to check that you can access the public internet from your notebook session.
- Run this command:
wget --spider https://www.oracle.com
You should see a response that includes the following line:
HTTP request sent, awaiting response... 200 OK
The 200 OK status indicates a successful test: you have public internet access in your notebook session.
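If you prefer to run the same connectivity check from a notebook cell instead of the terminal, the following is a minimal sketch that uses only the Python standard library. It sends a HEAD request, much like `wget --spider`, and reports whether the final response status is 200. The function name `can_reach` and its `opener` parameter (which exists only so that the function can be exercised without a network) are assumptions for illustration:

```python
import urllib.error
import urllib.request


def can_reach(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if a HEAD request to url ends with HTTP status 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # DNS failures, timeouts, refused connections, and HTTP errors land here.
        return False


# Example call from a notebook cell:
# can_reach("https://www.oracle.com")
```

A result of `False` from a notebook session in a private subnet usually points back to the subnet's route to the NAT gateway described in the VCN section.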
9 What's Next
You are done with this simple tenancy setup.
Now, you can follow the remaining instructions in the getting-started.ipynb notebook to set up the following from your notebook environment:
- OCI configuration file on the notebook environment.
- Access to the model catalog.
- Access to Object Storage.
- Access to Data Flow.
Using Notebook Sessions to Build and Train Models shows you how to continue with Data Science.