Tumgik
#GCP node setup
fcfvafeed · 16 days
Text
Running a uPlexa Node: Empowering Decentralization and Efficiency
What is a Node? Photo by Merlin Lightpainting on Pexels.com In the world of cryptocurrencies and blockchain technology, a node refers to a computer that participates in the network of a particular blockchain. Nodes are crucial components of a decentralized system, as they store, verify, and propagate the blockchain’s data. Every time a transaction is made or a new block is added, nodes across…
Tumblr media
View On WordPress
0 notes
codeonedigest · 5 months
Text
Master Google Cloud: Deploying Node JS APIs on VM
Full Video Link - https://youtu.be/gxZ-iJNCbAM Check out this new video on the CodeOneDigest YouTube channel! Learn how to create Virtual Machine in Google Cloud Platform, Setup Google Compute Engine VM & Deploy run JS APIs in VM. #codeonedigest
In this tutorial, we will create & setup Google Compute Engine Virtual Machine in Google Cloud Platform. We will be deploying & running javascript APIs in google compute engine virtual machine. We will be opening firewall port for incoming API request in VM. We will also learn how to deploy API code and run API service in google compute engine virtual machine. I will provide step by step guide to…
Tumblr media
View On WordPress
0 notes
prabhatdavian-blog · 8 days
Text
Master Ansible: Automation & DevOps with Real Projects
Introduction
In today's fast-paced IT world, automation is no longer a luxury; it's a necessity. One of the most powerful tools driving this revolution is Ansible. If you're looking to simplify complex tasks, reduce human error, and speed up your workflows, mastering Ansible is a must. This article will take you through Ansible’s role in DevOps and automation, providing practical insights and real-world examples to help you get the most out of it.
What is Ansible?
Ansible is an open-source tool that automates software provisioning, configuration management, and application deployment. Initially developed by Michael DeHaan in 2012, it has quickly risen to become a favorite among IT professionals.
The tool is known for its simplicity, as it doesn’t require agents to be installed on the machines it manages. Ansible operates through a simple YAML syntax, making it accessible even to beginners.
Why Ansible is Essential for Automation
Ansible’s automation capabilities are vast. It saves time by automating repetitive tasks, such as server configuration and software installations. By eliminating manual processes, it reduces the chance of human error. In short, Ansible gives teams more time to focus on high-priority work, enabling them to be more productive.
The Role of Ansible in DevOps
In a DevOps environment, where continuous integration and continuous deployment (CI/CD) pipelines are critical, Ansible plays a crucial role. It helps manage configurations, automate deployments, and orchestrate complex workflows across multiple systems. This ensures that your applications are delivered faster and with fewer issues.
Key Areas Where Ansible Shines in DevOps:
Configuration Management: Ensures consistency across servers.
Orchestration: Automates multi-tier rollouts.
Continuous Deployment: Simplifies application rollouts with zero downtime.
How Ansible Works
One of the most appealing aspects of Ansible is its agentless architecture. Unlike other automation tools, you don’t need to install agents on the systems Ansible manages. It uses SSH (Secure Shell) to communicate, making it lightweight and secure.
There are two main configuration models:
Push Model: Where Ansible pushes configurations to the nodes.
Pull Model: Common in other tools but not the default in Ansible.
Ansible Playbooks: The Heart of Automation
Playbooks are your go-to resource if you want to automate tasks with Ansible. Playbooks are files written in YAML that define a series of tasks to be executed. They are straightforward and readable, even for those with limited technical expertise.
Understanding Ansible Modules
Ansible comes with a wide range of modules, which are units of code that execute tasks like package management, user management, and networking. You can think of modules as the building blocks of your playbooks.
For example:
apt for managing packages on Ubuntu.
yum for managing packages on CentOS/RHEL.
file for managing files and directories.
Real-World Ansible Use Cases
Ansible isn’t just for small-scale automation. It’s used in enterprises around the world for various tasks. Some common use cases include:
Automating Cloud Infrastructure: Managing AWS, GCP, or Azure environments.
Managing Docker Containers: Automating container orchestration and updates.
Database Management: Automating tasks like backups, migrations, and configuration management.
Ansible vs. Other Automation Tools
Ansible often gets compared to other tools like Puppet, Chef, and Terraform. While each tool has its strengths, Ansible is popular due to its simplicity and agentless nature.
Ansible vs. Puppet: Puppet requires agents, while Ansible does not.
Ansible vs. Chef: Chef has a more complex setup.
Ansible vs. Terraform: Terraform excels at infrastructure as code, while Ansible is better for application-level automation.
Advanced Ansible Techniques
Once you’ve mastered the basics, you can dive into more advanced features like:
Using Variables: Pass data dynamically into your playbooks.
Loops and Conditionals: Add logic to your tasks for more flexibility.
Error Handling: Use blocks and rescue statements to manage failures gracefully.
Ansible Galaxy: Boost Your Efficiency
Ansible Galaxy is a repository for pre-built roles that allow you to speed up your automation. Instead of building everything from scratch, you can leverage roles that the community has shared.
Security Automation with Ansible
Security is a growing concern in IT, and Ansible can help here too. You can automate tasks like:
Security Patches: Keep your systems up-to-date with the latest patches.
Firewall Configuration: Automate firewall rule management.
Monitoring and Logging with Ansible
To ensure that your systems are running smoothly, Ansible can help with monitoring and logging. Integrating tools like ELK (Elasticsearch, Logstash, Kibana) into your playbooks can help you stay on top of system health.
Ansible Best Practices
To ensure your Ansible setup is as efficient as possible:
Structure Your Playbooks: Break large playbooks into smaller, reusable files.
Version Control: Use Git to manage changes.
Document Everything: Make sure your playbooks are well-documented for easy handover and scaling.
Conclusion
Ansible is a powerful automation tool that simplifies everything from configuration management to application deployment. Its simplicity, flexibility, and agentless architecture make it an ideal choice for both small teams and large enterprises. If you're serious about improving your workflows and embracing automation, mastering Ansible is the way forward.
FAQs
What are Ansible's prerequisites?
You need Python installed on both the controller and managed nodes.
How does Ansible handle large infrastructures?
Ansible uses parallelism to manage large infrastructures efficiently.
Can Ansible manage Windows machines?
Yes, Ansible has modules that allow it to manage Windows servers.
Is Ansible free to use?
Ansible is open-source and free, though Ansible Tower is a paid product offering additional features.
How often should playbooks be updated?
Playbooks should be updated regularly to account for system changes, software updates, and security patches.
0 notes
qcs01 · 3 months
Text
Getting Started with OpenShift: Environment Setup
OpenShift is a powerful Kubernetes-based platform that allows you to develop, deploy, and manage containerized applications. This guide will walk you through setting up an OpenShift environment on different platforms, including your local machine and various cloud services.
Table of Contents
1. [Prerequisites]
2. [Setting Up OpenShift on a Local Machine](#setting-up-openshift-on-a-local-machine)
    - [Minishift]
    - [CodeReady Containers]
3. [Setting Up OpenShift on the Cloud]
    - [Red Hat OpenShift on AWS]
    - [Red Hat OpenShift on Azure]
    - [Red Hat OpenShift on Google Cloud Platform]
4. [Common Troubleshooting Tips]
5. [Conclusion]
Prerequisites
Before you begin, ensure you have the following prerequisites in place:
- A computer with a modern operating system (Windows, macOS, or Linux).
- Sufficient memory and CPU resources (at least 8GB RAM and 4 CPUs recommended).
- Admin/root access to your machine.
- Basic understanding of containerization and Kubernetes concepts.
Setting Up OpenShift on a Local Machine
Minishift
Minishift is a tool that helps you run OpenShift locally by launching a single-node OpenShift cluster inside a virtual machine. 
Step-by-Step Guide
1. Install Dependencies
   - VirtualBox: Download and install VirtualBox from [here](https://www.virtualbox.org/).
   - Minishift: Download Minishift from the [official release page](https://github.com/minishift/minishift/releases) and add it to your PATH.
2. Start Minishift
   Open a terminal and start Minishift:
   ```sh
   minishift start
   ```
3. Access OpenShift Console
 Once Minishift is running, you can access the OpenShift console at `https://192.168.99.100:8443/console` (the IP might vary, check your terminal output for the exact address).
   ![Minishift Console](https://example.com/minishift-console.png)
CodeReady Containers
CodeReady Containers (CRC) provides a minimal, preconfigured OpenShift cluster on your local machine, optimized for testing and development.
Step-by-Step Guide
1. Install CRC
   - Download CRC from the [Red Hat Developers website](https://developers.redhat.com/products/codeready-containers/overview).
   - Install CRC and add it to your PATH.
2. Set Up CRC
   - Run the setup command:
     ```sh
     crc setup
     ```
3. Start CRC
   - Start the CRC instance:
     ```sh
     crc start
     ```
4. Access OpenShift Console
   Access the OpenShift web console at the URL provided in the terminal output.
   ![CRC Console](https://example.com/crc-console.png)
Setting Up OpenShift on the Cloud
Red Hat OpenShift on AWS
Red Hat OpenShift on AWS (ROSA) provides a fully-managed OpenShift service.
Step-by-Step Guide
1. Sign Up for ROSA
   - Create a Red Hat account and AWS account if you don't have them.
   - Log in to the [Red Hat OpenShift Console](https://cloud.redhat.com/openshift) and navigate to the AWS section.
2. Create a Cluster
   - Follow the on-screen instructions to create a new OpenShift cluster on AWS.
3. Access the Cluster
   - Once the cluster is up and running, access the OpenShift web console via the provided URL.
   ![ROSA Console](https://example.com/rosa-console.png)
Red Hat OpenShift on Azure
Red Hat OpenShift on Azure (ARO) offers a managed OpenShift service integrated with Azure.
Step-by-Step Guide
1. Sign Up for ARO
   - Ensure you have a Red Hat and Azure account.
   - Navigate to the Azure portal and search for Red Hat OpenShift.
2. Create a Cluster
   - Follow the wizard to set up a new OpenShift cluster.
3. Access the Cluster
   - Use the URL provided to access the OpenShift web console.
   ![ARO Console](https://example.com/aro-console.png)
Red Hat OpenShift on Google Cloud Platform
OpenShift on Google Cloud Platform (GCP) allows you to deploy OpenShift clusters managed by Red Hat on GCP infrastructure.
Step-by-Step Guide
1. Sign Up for OpenShift on GCP
   - Set up a Red Hat and Google Cloud account.
   - Go to the OpenShift on GCP section on the Red Hat OpenShift Console.
2. Create a Cluster
   - Follow the instructions to deploy a new cluster on GCP.
3. Access the Cluster
   - Access the OpenShift web console using the provided URL.
   ![GCP Console](https://example.com/gcp-console.png)
Common Troubleshooting Tips
- Networking Issues: Ensure that your firewall allows traffic on necessary ports (e.g., 8443 for the web console).
- Resource Limits: Check that your local machine or cloud instance has sufficient resources.
- Logs and Diagnostics: Use `oc logs` and `oc adm diagnostics` commands to troubleshoot issues.
Conclusion
Setting up an OpenShift environment can vary depending on your platform, but with the steps provided above, you should be able to get up and running smoothly. Whether you choose to run OpenShift locally or on the cloud, the flexibility and power of OpenShift will enhance your containerized application development and deployment process.
[OpenShift](https://example.com/openshift.png)
For further reading and more detailed instructions, refer to the www.qcsdclabs.com 
0 notes
kubernetesonline · 7 months
Text
Docker Online Training | Visualpath
Docker Machine and Docker Swarm: How They Work?
Introduction:
Docker, with its containerization technology, has revolutionized how applications are built, shipped, and deployed. However, managing Docker containers across various environments can still be a daunting task, especially when dealing with multiple hosts or cloud providers. This is where Docker Machine comes into play, offering a streamlined approach to managing Docker hosts regardless of the underlying infrastructure. - Docker and Kubernetes Training
Docker Machine:
Docker Machine is a tool that enables developers to create and manage Docker hosts on local machines, remote servers, or cloud providers effortlessly. It abstracts away the complexity of setting up Docker environments by automating the provisioning process. With Docker Machine, developers can easily spin up multiple Docker hosts, each with its own configuration, to run containerized applications.  - Kubernetes Online Training
Simplified Deployment Workflow:
One of the key benefits of Docker Machine is its simplified deployment workflow. Instead of manually configuring Docker hosts on different platforms, developers can use Docker Machine to automate the process. By simply running a few commands, they can create Docker hosts on local machines for development or on remote servers for production deployment.
Multi-Cloud Support:
Docker Machine offers support for multiple cloud providers, including Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and more. This allows developers to deploy Docker hosts on their preferred cloud infrastructure without having to deal with the intricacies of each provider's setup process. - Docker Online Training
Scaling with Ease:
Scaling containerized applications can be challenging, especially when dealing with a large number of Docker hosts. Docker Machine simplifies the scaling process by allowing developers to quickly add or remove hosts as needed. Whether it's scaling horizontally to handle increased traffic or vertically to accommodate resource-intensive workloads, Docker Machine provides the tools to effortlessly manage the underlying infrastructure.
Integration with Docker Swarm:
For orchestrating and managing clusters of Docker hosts, Docker Machine seamlessly integrates with Docker Swarm. Developers can use Docker Machine to create Swarm nodes across multiple hosts, enabling high availability and fault tolerance for their applications.
Conclusion:
In conclusion, Docker Machine is a powerful tool for simplifying the management of Docker hosts across various environments. By automating the provisioning process and providing support for multi-cloud deployment, Docker Machine enables developers to focus on building and deploying applications rather than worrying about infrastructure setup.
Visualpath is the Leading and Best Institute for learning Docker And Kubernetes Online in Ameerpet, Hyderabad. We provide Docker Online Training Course, you will get the best course at an affordable cost.
Attend Free Demo
Call on - +91-9989971070.
Visit : https://www.visualpath.in/DevOps-docker-kubernetes-training.html
Blog : https://dockerandkubernetesonlinetraining.blogspot.com/
0 notes
thereactcompany · 9 months
Text
Adding Markers and Custom Icons on Google Maps in React.js
Tumblr media
Google Maps is a powerful tool for displaying and interacting with geographic information in web applications. In this blog post, we will explore how to add markers with custom icons to a Google Map using React.js. Custom markers can help you make your maps more visually appealing and informative.
Prerequisites: Before we begin, make sure you have the following prerequisites in place:
A basic understanding of React.js and JavaScript.
Node.js and npm (Node Package Manager) installed on your development machine.
A Google Cloud Platform (GCP) account with billing enabled and the Maps JavaScript API enabled.
Let’s get started!
Step 1: Set Up a React.js Project If you don’t already have a React.js project, you can create one using Create React App or your preferred React.js project setup.
npx create-react-app custom-marker-map
cd custom-marker-map
npm start
Step 2: Create a Google Maps Component Next, let’s create a React component that will display the Google Map.
// src/components/GoogleMap.js
import React, { Component } from 'react';
class GoogleMap extends Component {
componentDidMount() {
// Load the Google Maps JavaScript API
const script = document.createElement('script');
script.src = `https://maps.googleapis.com/maps/api/js?key=YOUR_API_KEY&libraries=places`;
script.async = true;
script.defer = true;
script.onload = this.initMap;
document.head.appendChild(script);
}
initMap() {
// Initialize the map
const map = new window.google.maps.Map(document.getElementById('map'), {
center: { lat: 37.7749, lng: -122.4194 }, // Set your initial map center coordinates
zoom: 12, // Set the initial zoom level
});
// Add markers to the map
const marker = new window.google.maps.Marker({
position: { lat: 37.7749, lng: -122.4194 }, // Set marker coordinates
map: map,
icon: 'path/to/custom-marker.png', // Path to your custom marker icon
title: 'Custom Marker',
});
}
render() {
return <div id="map" style={{ width: '100%', height: '400px' }}></div>;
}
}
export default GoogleMap;
In this component, we load the Google Maps JavaScript API using a script tag and initialize the map with a specified center and zoom level. We then add a custom marker to the map using the google.maps.Marker class and provide the path to the custom marker icon.
Step 3: Display the Map Component Now, import and render the GoogleMap component in your main App.js file or any other desired location within your React app.
// src/App.js
import React from 'react';
import './App.css';
import GoogleMap from './components/GoogleMap';
function App() {
return (
<div className="App">
<h1>Custom Marker Map</h1>
<GoogleMap />
</div>
);
}
export default App;
Step 4: Customize the Marker Icon To use a custom icon for your marker, replace 'path/to/custom-marker.png' with the path to your custom marker icon image. You can use a PNG or SVG file for your marker.
Step 5: Run Your React App Start your React app by running:
npm start
You should now see a Google Map with a custom marker icon at the specified coordinates.
Conclusion: In this blog post, we’ve learned how to add markers with custom icons to a Google Map in a React.js application. Custom markers can help you personalize your maps and provide valuable information to your users. You can further enhance your map by adding interactivity and additional features, such as info windows, by exploring the Google Maps JavaScript API documentation. Happy mapping!
React Company provides access to a team of experienced React developers who are ready to answer your questions and help you solve any problems you may encounter.
For any inquiries or further assistance, please don’t hesitate to contact us.
For more details you can connect with bosc tech labs.
0 notes
Link
0 notes
ugacomp · 3 years
Text
We built a live streaming infrastructure for a client to serve over 1000 users
Tumblr media
At Ugacomp, we remotely work with clients to offer them technology solutions they need. From remote IT consulting and project assessment to designing and deploying technical projects, we proudly boast of handling 'Everything Tech'
We recently got a remote client from Philippines who wanted us to help him remotely design and deploy a live streaming infrastructure with capacity to serve over 100s of real-time streaming users. Below is a snapshot of some of our discussions we had with this client.
(Critical data about the client has been redacted)
Tumblr media
Now, our client's live streaming infrastructure would run on top of Microsoft Azure, without him having to self-host physical IT infrastructure at his business premises. For those of you who don't know anything about Microsoft Azure, it is an Infrastructure-as-a-Service (IaaS) platform that allows you to remotely deploy and configure both virtual and bare metal servers in the cloud.
In order to get started, we undertook a sequence of activities to execute the configurations as needed. And here is how we went about it;
We setup the appropriate billing and subscription Plan
One of the interesting things that make cloud computing interesting is not having to incur upfront spending like in traditional IT infrastructure deployments. Cloud computing offers cost-effective pricing strategy where you have the options of choosing the right subscription for your business. For the case of our client, we setup Azure's Pay as You Go pricing Plan. This is means that Microsoft Azure would charge the client based on the amount of resources the deployed server infrastructure consumes in a given period of time.
We deployed Ant Media Server Instance
Ant Media Server is an application server designed to handle live video streaming content using User Datagram Protocol (UDP). Ant Media server has two license plans i.e. the free community edition and the enterprise edition license.
We chose the enterprise version because it had all the features and capabilities for enterprise clientele performance setup
Tumblr media
The kind of server instance we deployed was 'compute-optimized' with sufficient computing resources to handle high quality video streaming requests from 100s of simultaneous users. We actually opted for 64 Virtual CPU cores and 128 GiB Memory
Tumblr media
We configured the DNS records
Our client wanted his Live streaming server IP address to be bound to a custom domain name, which he had provided. We had to achieve this by using Azure's DNS Zones' service.
Tumblr media
Using Azure's DNS Zones service, we went ahead and configured the Nameservers and other critical DNS records info required to bind custom domain to the server's IP address.
Tumblr media
We integrated SSL certificate to the Server
We used SSH to run a couple of commands that helped us to achieve SSL installation to the server. The good thing is that Ant Media application server ships with SSL certificate that we had to configure with a couple of commands.
Configuring Firewall rules
Ant Media Server Azure instance is preconfigured with all the necessary firewall configurations. Once you fire up its preconfigured VM instance, all the necessary security configurations will automatically be installed.
Tumblr media
To be sure if everything was working, we fired up our SSH access to the server and checked if ant media was setup and running correctly. We realized everything was fine as we wanted.
Tumblr media
Finally, the client's project was up and running within a day. We tirelessly prioritized our client's work as it was urgently needed to be completed in just one day.
Tumblr media Tumblr media
We beat the deadline by 6 hours. It was such an incredible project experience we've ever handled.
Our client was happy for the good job we did, Up to now, he keeps consulting with us on a couple of IT support projects. We're continuously helping him to solve or troubleshoot tech problems on a remote basis
How we can work with you
We help clients from any part of the world to design, develop, configure, deploy or troubleshoot any IT project. If you're in our geographical proximity, we can work with you onsite. However, if we can't connect with onsite, then we  can use remote-working model to help you execute any tech project you have.
Our services include;
Designing custom Web applications (apps & websites)
Deploying IT infrastructure in the cloud (AWS, Azure, GCP, IBM, Oracle, Alibaba, Digital Ocean, Linode, Contabo, Vultr, SSD Nodes, and more)
Deploying Virtual Desktop environments in the cloud
Configuring VPS servers for web hosting services
IT infrastructure analysis and audit (performance, Security and cost optimization)
Proffessional IT consulting and project assessment
Contractual-based IT support services for businesses or enterprises (onsite and off-site)
Network design and deployment (onsite projects)
Enterprise IT compliance Assessments and Audits
Enterprise technology assessments for businesses and organizations i.e. helping to introduce new tech solutions to your business
Corporate IT trainings for company employees
Onsite IT infrastructure deployment (full-stack configurations and deployments)
Corporate IT Policy formulation and implementation.
Enterprise Software acquisition and licensing
And everything tech as needed by your company or business.
For urgent projects, hire us on Fiverr
We use Fiverr to accept remote hiring because it is a platform that guarantees trust between you; our remote employer and us; the remote employees. After agreeing with us on a particular project, you can pay us through Fiverr. That money will be kept by Fiverr platform. Then, on our side, we will work on your project, complete it and submit it to you. After you've received the completed project as you wanted, Fiverr will release the money to us. If the work isn't done as you wanted, you can ask Fiverr to refund you back your money. It is as secure as that, isn't it? :)
Send us an email; [email protected]. Call us/ WhatsApp us: +256758057003
3 notes · View notes
Quote
by Steef-Jan Wiggers Follow In a recent blog post, Google announced the beta of Cloud AI Platform Pipelines, which provides users with a way to deploy robust, repeatable machine learning pipelines along with monitoring, auditing, version tracking, and reproducibility.  With Cloud AI Pipelines, Google can help organizations adopt the practice of Machine Learning Operations, also known as MLOps – a term for applying DevOps practices to help users automate, manage, and audit ML workflows. Typically, these practices involve data preparation and analysis, training, evaluation, deployment, and more.  Google product manager Anusha Ramesh and staff developer advocate Amy Unruh wrote in the blog post:  When you're just prototyping a machine learning (ML) model in a notebook, it can seem fairly straightforward. But when you need to start paying attention to the other pieces required to make an ML workflow sustainable and scalable, things become more complex. Moreover, when complexity grows, building a repeatable and auditable process becomes more laborious. Cloud AI Platform Pipelines - which runs on a Google Kubernetes Engine (GKE) Cluster and is accessible via the Cloud AI Platform dashboard – has two major parts:  The infrastructure for deploying and running structured AI workflows integrated with GCP services such as BigQuery, Dataflow, AI Platform Training and Serving, Cloud Functions, and The pipeline tools for building, debugging and sharing pipelines and components. With the Cloud AI Platform Pipelines users can specify a pipeline using either the Kubeflow Pipelines (KFP) software development kit (SDK) or by customizing the TensorFlow Extended (TFX) Pipeline template with the TFX SDK. The latter currently consists of libraries, components, and some binaries and it is up to the developer to pick the right level of abstraction for the task at hand. Furthermore, TFX SDK includes a library ML Metadata (MLMD) for recording and retrieving metadata associated with the workflows; this library can also run independently.  Google recommends using KPF SDK for fully custom pipelines or pipelines that use prebuilt KFP components, and TFX SDK and its templates for E2E ML Pipelines based on TensorFlow. Note that over time, Google stated in the blog post that these two SDK experiences would merge. The SDK, in the end, will compile the pipeline and submit it to the Pipelines REST API; the AI Pipelines REST API server stores and schedules the pipeline for execution. An open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes called Argo runs the pipelines, which includes additional microservices to record metadata, handle components IO, and schedule pipeline runs. The Argo workflow engine executes each pipeline on individual isolated pods in a GKE cluster – allowing each pipeline component to leverage Google Cloud services such as Dataflow, AI Platform Training and Prediction, BigQuery, and others. Furthermore, pipelines can contain steps that perform sizeable GPU and TPU computation in the cluster, directly leveraging features like autoscaling and node auto-provisioning.   Source: https://cloud.google.com/blog/products/ai-machine-learning/introducing-cloud-ai-platform-pipelines AI Platform Pipeline runs include automatic metadata tracking using the MLMD - and logs the artifacts used in each pipeline step, pipeline parameters, and the linkage across the input/output artifacts, as well as the pipeline steps that created and consumed them. With Cloud AI Platform Pipelines, according to the blog post customers will get: Push-button installation via the Google Cloud Console Enterprise features for running ML workloads, including pipeline versioning, automatic metadata tracking of artifacts and executions, Cloud Logging, visualization tools, and more  Seamless integration with Google Cloud managed services like BigQuery, Dataflow, AI Platform Training and Serving, Cloud Functions, and many others  Many prebuilt pipeline components (pipeline steps) for ML workflows, with easy construction of your own custom components The support for Kubeflow will allow a straightforward migration to other cloud platforms, as a respondent on a Hacker News thread on Google AI Cloud Pipeline stated: Cloud AI Platform Pipelines appear to use Kubeflow Pipelines on the backend, which is open-source and runs on Kubernetes. The Kubeflow team has invested a lot of time on making it simple to deploy across a variety of public clouds, such as AWS, and Azure. If Google were to kill it, you could easily run it on any other hosted Kubernetes service. The release of AI Cloud Pipelines shows Google's further expansion of Machine Learning as a Service (MLaaS) portfolio - consisting of several other ML centric services such as Cloud AutoML, Kubeflow and AI Platform Prediction. The expansion is necessary to allow Google to further capitalize on the growing demand for ML-based cloud services in a market which analysts expect to reach USD 8.48 billion by 2025, and to compete with other large public cloud vendors such as Amazon offering similar services like SageMaker and Microsoft with Azure Machine Learning. Currently, Google plans to add more features for AI Cloud Pipelines. These features are: Easy cluster upgrades  More templates for authoring ML workflows More straightforward UI-based setup of off-cluster storage of backend data Workload identity, to support transparent access to GCP services, and  Multi-user isolation – allowing each person accessing the Pipelines cluster to control who can access their pipelines and other resources. Lastly, more information on Google's Cloud AI Pipeline is available in the getting started documentation.
http://damianfallon.blogspot.com/2020/03/google-announces-cloud-ai-platform.html
1 note · View note
globalmediacampaign · 3 years
Text
Comparing RonDB 21.04.0 on AWS, Azure and GCP using Sysbench
 Release of RonDB 21.04.0RonDB is based on MySQL NDB Cluster optimised for use in modern cloud settings. Today we launch RonDB 21.04.0. In RonDB 21.04.0 we have integrated benchmark scripts to execute various benchmarks towards RonDB.There are three ways of using RonDB. The first is using the managed version provided by Logical Clocks. This is currently available in AWS and is currently being developed to also support Azure. This is still in limited access mode. To access it contact Logical Clocks at the rondb.com website.The second way is to use a script provided by Logical Clocks that automates the creation of VMs and the installation of the software components required by RonDB. These scripts are available to create RonDB clusters on Azure and GCP (Google Cloud). This script can be downloaded from nexus.hops.works/rondb-cloud-installer.sh.The third manner to use RonDB is to simply download the RonDB binary tarball and install it on any computers of your own liking.All these methods start by visiting http://rondb.com. From here you will find the download scripts, the tarball to download and to send an email request access to the managed version of RonDB.RonDB 21.04.0 can be used in any of the above settings, but we focus our development, testing and optimisations towards executing RonDB in an efficient manner in AWS, Azure and GCP. We will likely add Oracle Cloud eventually to the mix as well.Benchmark SetupWhat we have discovered in our benchmarking is that even with very similar HW there are some differences in how RonDB performs on the different clouds. So this report presents the results using very similar setups in AWS, Azure and GCP.Above we have the benchmark setup used in all the experiments. There are always 2 RonDB data nodes and they are replicas of each other. Thus all write operations are written on both data nodes to ensure that we are always available even in the presence of node failures.The MySQL Servers are pure clients since data is located on the RonDB data nodes. Thus we can easily scale the benchmark using any number of MySQL Servers. The benchmark application runs on a single VM that sends SQL queries to the MySQL Servers and receives results using a MySQL client written in C. It is sufficient to have a single Sysbench server for these experiments.In this experiment we will scale RonDB data nodes by using different VM types. It is also possible to scale RonDB by adding more RonDB data nodes. Both of these changes can be performed without any downtime.It is possible to execute the Sysbench server local to the MySQL Server and let multiple Sysbench servers execute in parallel. This would however be a 2-tiered cluster and we wanted to test a 3-tiered cluster setup since we think this is the most common setup used. Obviously a 2-tiered cluster setup will have lower latency, but it will also be more complex to maintain.There is also a RonDB management server in the setup, however this is not involved in the benchmark execution and is either located in the Sysbench server or a separate dual CPU VM. Availability ZonesAWS, Azure and GCP all use a concept called Availability Zones. These are located in the same city, but can be distant from each other. The latency between Availability Zones can be more than 1 millisecond in latency for each jump. RonDB contains options to optimise for such a setup, but in this test we wanted to test a setup that is within an Availability Zone.Thus all setups we ensured that all VMs participating in cluster setup were in the same zone. Even within a zone the variance on the latency can be substantial. We see this in that the benchmark numbers can vary even within the same cloud and the same availability zone on different runs. From other reports it is reported that network latency is around 40-60 microseconds between VMs in the same availability zone. Our experience is that it is normal that this latency varies at least 10-20 microseconds up or down. In Azure it is even possible that the variance is higher since they can implement some availability zones in multiple buildings. In this case Azure provides a concept called Proximity Placement Groups that can be used to ensure that VMs are located in the same building and not spread between buildings in the same availability zone.RonDB VM TypesAll cloud vendors have VMs that come from different generations of SW and HW. For a latency sensitive application like RonDB this had serious implications. All the VMs we tested used very similar Intel x86 CPUs. There is some difference in performance between older Intel x86 and newer CPUs. However this difference is usually on the order of 30-40%, so not so drastic.However an area where innovation has happened at a much higher pace is networking. Cloud vendors have drastically improved the networking latency, bandwidth and efficiency from generation to generation.What we found is that it is essential to use the latest VM generation for MySQL Servers and RonDB data nodes. The difference between the latest generation and the previous generation was up to 3x in latency and performance. We found that the latest generation of VMs from all cloud vendors have similar performance, but using older versions had a high impact on the benchmark results. All the benchmarking results in this report uses the latest generation of VMs from all vendors.For AWS this means using their 5th generation VMs. AWS has three main categories of VMs, these c5, m5 and r5. c5 VMs are focused on lots of CPU and modest amounts of memory. m5 VMs twice as much memory with the same amount of CPU and r5 have 4x more memory than the c5 and the same amount of CPU. For RonDB this works perfectly fine. The RonDB data nodes store all the data and thus require as much memory as possible. Thus we use the r5 category here. MySQL Servers only act as clients in RonDB setup, thus require only a modest amount of memory, thus we use the c5 category here.The latest generation in Azure is the v4 generation. Azure VMs have two categories, the D and E VMs. The E category has twice as much memory as the D category. The E category is similar to AWS r5 and the D category is similar to the AWS m5 category.The latest generation in GCP is the n2 generation. They have n2-highcpu that matches AWS c5, n2-standard that matches AWS m5 and n2-highmem that matches AWS r5. GCP also has the ability to extend memory beyond 8 GB per CPU which is obviously interesting for RonDB.Benchmark Notes on Cloud VendorsSince we developed the RonDB managed version on AWS we have a bit more experience from benchmarking here. We quickly discovered that the standard Sysbench OLTP RW benchmark actually is not only a CPU benchmark. It is very much a networking benchmark as well. In some benchmarks using 32 VCPUs on the data nodes, we had to send around 20 Gb/sec from the data node VMs. Not all VM types could handle this. In AWS this meant that we had to use a category called r5n. This category uses servers that have 100G Ethernet instead of 25G Ethernet and thus a 32 VCPU VM was provided with bandwidth up to 25G. We didn’t investigate this thoroughly on Azure and GCP.Some quirks we noted was that the supply of Azure v4 VM instances was somewhat limited. In some regions it was difficult to succeed in allocating a set of large v4 VM instances. In GCP we had issues with our quotas and got a denial to increase the quota size for n2 VMs, which was a bit surprising. This meant that we executed not as many configurations on Azure and GCP. Thus some comparisons are between Azure and AWS only.Using the latest VM generation AWS, Azure and GCP all had reasonable performance. There were differences of course, but between 10-30% except in one benchmark. Our conclusion is that AWS, Azure and GCP have used different strategies in how to handle networking interrupts. AWS reports the lowest latency on networking in our tests and this is also seen in other benchmark reports. However GCP shows both in our benchmarks and other similar reports to have higher throughput but worse latency. Azure falls in between those.Our conclusion is that it is likely caused by how network interrupts are handled. If the network interrupts are acted upon immediately one gets the best possible latency. But at high loads the performance goes down since interrupt handling costs lots of CPU. If network interrupts are instead handled using polling the latency is worse, but at high loads the cost of interrupts stays low even at extreme loads.Thus best latency is achieved through handling interrupts directly and using polling one gets better performance the longer the delay in the network interrupt. Obviously the true answer is a lot more complex than this, but suffice it to say that the cloud vendors have selected different optimisation strategies that work well in different situations.Benchmark Notes on RonDBOne more thing that affects latency of RonDB to a great extent is the wakeup latency of threads. Based on benchmarks I did while at Oracle I concluded that wakeup latency is about 2x higher on VMs compared to on bare metal. On VMs it can be as high as 25 microseconds, but is likely nowadays to be more like on the order of 10-15 microseconds.RonDB implements adaptive CPU spinning. This ensures that latency is decreasing when the load increases. This means that we get a latency curve that starts a bit higher, then goes down until the queueing for CPU resources starts to impact latency and after that it follows a normal latency where latency increases as load increases.Latency variations are very small up to about 50% of the maximum load on RonDB.In our benchmarks we have measured the latency that 95% of the transactions were below. Thus we didn’t focus so much on single outliers. RonDB is implementing soft real-time, thus it isn’t intended for hard real-time applications where life depends on individual transactions completing in time.The benchmarks do however report a maximum latency. Most of the time these maximum latencies were as expected. But one outlier we saw, this was on GCP where we saw transaction latency at a bit above 200 ms when executing benchmarks with up to 8 threads. These outliers disappeared when going towards higher thread counts. Thus it seems that GCP VMs have some sort of deep sleep that keeps them down for 200 ms. This latency was always in the range 200-210 milliseconds. Thus it seemed that there was a sleep of 200ms somewhere in the VM. In some experiments on Azure we saw even higher maximum latency with similar behaviour as on GCP. So it is likely that most cloud vendors (probably all) can go into deep sleeps that highly affect latency when operations start up again.Benchmark ConfigurationOk, now on to numbers. We will show results from 4 different setups. All setups use 2 data nodes. The first setup uses 2 MySQL Servers and both RonDB data nodes and MySQL Servers use VMs with 16 VCPUs. This setup mainly tests latency and performance of MySQL Servers in an environment where data nodes are not overloaded. This test compares AWS, Azure and GCP.The second setup increases the number of MySQL Servers to 4 in the same setup. This makes the data node the bottleneck in the benchmark. This benchmark also compares AWS, Azure and GCP.The third setup uses 16 VPUs on data nodes and 2 MySQL Servers using 32 VCPUs. This test shows performance in a balanced setup where both data nodes and MySQL Servers are close to their limit. This test compares AWS and Azure.The final setup compares a setup with 32 VCPUs on data nodes and 3 MySQL Servers using 32 VCPUs. This setup mainly focuses on behaviour latency and throughput of MySQL Servers in an environment where the data nodes are not the bottleneck. The test compares AWS with Azure.We used 3 different benchmarks. Standard Sysbench OLTP RW, this benchmark is both a test of CPU performance as well as networking performance. Next benchmark is the same as OLTP RW using a filter where the scans only return 1 of the 100 scanned rows instead of all of them. This makes the benchmark more CPU focused.The final benchmark is a key lookup benchmark that only sends SQL queries using IN statements. This means that each SQL query performs 100 key lookups. This benchmark shows the performance of simple key lookups using RonDB through SQL queries.ConclusionsThe results show clearly that AWS has the best latency numbers at low to modest loads. At high loads GCP gives the best results. Azure has similar latency to GCP, but doesn’t provide the same benefits at higher loads. These results are in line with similar benchmark reports comparing AWS, Azure and GCP.The variations from one benchmark run to another run can be significant when it comes to latency. This is natural since there is a random latency added dependent on how far apart the VMs are within the availability zone. However throughput is usually not affected in the same manner.In some regions Azure uses several buildings to implement one availability zone, this will affect latency and throughput negatively. In those regions it is important to use Proximity Placement Groups in Azure to ensure that all VMs are located in the same building. The effect of this is seen in the last benchmark results in this report.The limitations on VM networking are a bit different. This played out as a major factor in the key lookup benchmark where one could see that AWS performance was limited due to network bandwidth limitation. Azure VMs had access to a higher networking bandwidth for similar VM types.AWS provided the r5n VM types, this provided 4x more networking bandwidth with the same CPU and memory setup. This provided very useful for benchmarking using RonDB data nodes with 32 VCPUs.Benchmark Results2 Data Nodes@16 VCPUs, 2 MySQL Server@16 VCPUsStandard OLTP RWIn this benchmark we see clearly the distinguishing features of AWS vs GCP. AWShas better latency at low load. 6,5 milliseconds compared to 9,66 milliseconds.However GCP reaches higher performance. At 128 threads it reaches 7% higherperformance at 7% lower latency. So GCP focuses on the performance at high loadwhereas AWS focuses more on performance at lower loads. Both approaches haveobvious benefits, which is best is obviously subjective and depends on the application.This benchmark is mainly testing the throughput of MySQL Servers. The RonDBdata nodes are only loaded to about 60-70% of their potential throughput with2 MySQL Servers.Moving to latency numbers one can see the same story, but even clearer. AWS hasa better latency up to 48 threads where the latency of GCP becomes better. In GCPwe see that the latency at 1 thread is higher than the latency at 12 threads and onlyat 24 threads the latency starts to increase beyond the latency at 1 thread. Thus inGCP the latency is very stable over different loads until the load goes beyond 50%of the possible throughput. We see the same behaviour on Azure whereas AWSlatency slowly starts to increase at lower thread counts.Standard OLTP RW using filterThe OLTP RW using a filter is more focused on CPU performance. The major differenceis seen at higher loads. The latency at low loads is very similar, but at higher loads weget higher throughput at lower latency. Thus standard OLTP RW has a steeper marchfrom acceptable latency to high latency. The difference in throughput is very smallbetween cloud vendors, it is within 10%.The comparison between AWS and GCP is similar though. The GCP benefit at higherload is slightly higher and similar to the latency. The AWS advantage at lower loads isslightly lower. Thus GCP has a slight advantage compared to standard OLTP RW,but it is a very small difference.Key LookupsIn the graph below we see the number of key lookups that 2 MySQL Servers can drive.The numbers are very equal for the different cloud vendors. AWS as usual has anadvantage at lower thread counts and GCP gains the higher numbers at higherthread counts and Azure is usually in the middle.The latency numbers are shown below. These numbers more clearly show theadvantage of AWS at lower thread counts. At higher thread counts the latencyis mostly the same for all cloud vendors. This benchmark is extremely regularin its use case and thus it is mostly the CPU performance that matters in thisbenchmark. Since this is more or the less same on all cloud vendors we seeno major difference.2 Data Nodes@16 VCPUs, 4 MySQL Server@16 VCPUsIn this benchmark the bottleneck moves to the RonDB data nodes. We now havesufficient amounts of MySQL Servers to make the RonDB data nodes a bottleneck.This means a bottleneck that can be both a CPU bottleneck as well as a networkingbottleneck.Standard OLTP RWThe latency is very stable until we reach 64 threads where we have around 15k TPS at20 milliseconds latency. At higher thread counts the data nodes becomes the bottleneckand in this case the latency has a much higher variation. We can even see that latencyat 128 threads in Azure goes down and throughput up. We expect that this is due tointerrupt handling being executed on the same CPUs as database processing happens.This is something that we will look more into.OLTP RW using filterThe throughput of OLTP with a filter means that the focus is more on CPU performance.This makes it clear that the high variation on throughput and latency in standard OLTP RWcomes from handling the gigabytes per second of data to send to the MySQL Servers.In this benchmark the throughput increases in a stable manner and similarly the latencygoes up in an expected manner.All cloud vendors are very close to each other except at low thread counts whereAWS have an advantage.Key LookupsThe key lookups with 4 MySQL Server and 2 data nodes and all nodes using16 VCPUs per node moves the bottleneck to the data node. As usual AWSwins out on the latency at lower thread counts. But at higher thread countsAWS hits a firm wall. Most likely it hits a firm bandwidth limitation on the VMs.This limitation is higher on Azure, thus these VM can go an extra mile and serve1 million more key lookups per second.2 Data Nodes@16 VCPUs, 2 MySQL Server@32 VCPUsThis benchmark uses the same amount of CPUs on the MySQL Server side,but instead of divided on 4 MySQL Servers, it is using 2 MySQL Servers.We didn’t test GCP in this configuration. We expect no surprises in throughputand latency if we do.Standard OLTP RWIn the Standard OLTP RW we see that the throughput is the same as with4 MySQL Servers. However the throughput increases in a more regular manner.What we mainly see is that we can achieve a higher throughput using a smalleramount of threads in total. This makes the throughput more stable. Thus weconclude that at least up to 32 VCPUs it pays off to use larger MySQL Serversif required.2 Data Nodes@32 VCPUs, 3 MySQL Server@32 VCPUsIn this benchmark we increased the number of CPUs on the RonDB datanodes to 32 VCPUs. Most of the testing in this setup has been performedon AWS. The main reason for including the Azure numbers is becausethese numbers show the impact of not using Proximity Placement Groupsin Azure on large regions. We saw clearly in these benchmarks that thelatency in the Azure setup was much higher than in previous benchmarksthat were using a smaller region.However in the smaller region it was difficult to allocate these larger VMsin any larger number. We constantly got failures due to lacking resourcesto fulfil our requests.Standard OLTP RWIn AWS we discovered that the network was a bottleneck when executingthis benchmark. Thus we used r5n.8xlarge instead of r5.8xlarge VMs inthis benchmark. These VMs reside in machines with 100G Ethernetconnections and each 32 VCPU VM have access to at least 25 Gb/secnetworking. The setup tested here with 3 MySQL Servers doesn’t load theRonDB data node fully. In other benchmarks we were able to increasethroughput to around 35k TPS. However these benchmarks used a differentsetup, so these numbers are not relevant for a comparison. What we see isthat the throughput in this case is roughly twice the throughput when using16 VCPUs in the data nodes.Latency numbers look very good and it is clear that we haven't reallyreached the bottleneck really in neither the MySQL Servers nor theRonDB data nodes.OLTP RW using filterSimilarly in this experiment we haven’t really reached the bottleneck on neither theRonDB data nodes nor the MySQL Servers. So no real news from this benchmark. http://mikaelronstrom.blogspot.com/2021/04/comparing-rondb-21040-on-aws-azure-and.html
0 notes
qcs01 · 3 months
Text
Managing OpenShift Clusters: Best Practices and Tools
Introduction
Brief overview of OpenShift and its significance in the Kubernetes ecosystem.
Importance of effective cluster management for stability, performance, and security.
1. Setting Up Your OpenShift Cluster
Cluster Provisioning
Steps for setting up an OpenShift cluster on different platforms (bare metal, cloud providers like AWS, Azure, GCP).
Using OpenShift Installer for automated setups.
Configuration Management
Initial configuration settings.
Best practices for cluster configuration.
2. Monitoring and Logging
Monitoring Tools
Using Prometheus and Grafana for monitoring cluster health and performance.
Overview of OpenShift Monitoring Stack.
Logging Solutions
Setting up EFK (Elasticsearch, Fluentd, Kibana) stack.
Best practices for log management and analysis.
3. Scaling and Performance Optimization
Auto-scaling
Horizontal Pod Autoscaler (HPA).
Cluster Autoscaler.
Resource Management
Managing resource quotas and limits.
Best practices for resource allocation and utilization.
Performance Tuning
Tips for optimizing cluster and application performance.
Common performance issues and how to address them.
4. Security Management
Implementing Security Policies
Role-Based Access Control (RBAC).
Network policies for isolating workloads.
Managing Secrets and Configurations
Securely managing sensitive information using OpenShift secrets.
Best practices for configuration management.
Compliance and Auditing
Tools for compliance monitoring.
Setting up audit logs.
5. Backup and Disaster Recovery
Backup Strategies
Tools for backing up OpenShift clusters (e.g., Velero).
Scheduling regular backups and verifying backup integrity.
Disaster Recovery Plans
Creating a disaster recovery plan.
Testing and validating recovery procedures.
6. Day-to-Day Cluster Operations
Routine Maintenance Tasks
Regular updates and patch management.
Node management and health checks.
Troubleshooting Common Issues
Identifying and resolving common cluster issues.
Using OpenShift diagnostics tools.
7. Advanced Management Techniques
Custom Resource Definitions (CRDs)
Creating and managing CRDs for extending cluster functionality.
Operator Framework
Using Kubernetes Operators to automate complex application deployment and management.
Cluster Federation
Managing multiple OpenShift clusters using Red Hat Advanced Cluster Management (ACM).
Conclusion
Recap of key points.
Encouragement to follow best practices and continuously update skills.
Additional resources for further learning (official documentation, community forums, training programs).
By covering these aspects in your blog post, you'll provide a comprehensive guide to managing OpenShift clusters, helping your readers ensure their clusters are efficient, secure, and reliable.
For more details click www.qcsdclabs.com
0 notes
freeudemycourses · 4 years
Photo
Tumblr media
[FREE] Setup Single Node Cloudera Cluster on Google Cloud What you Will learn ? Set up One Machine Cloudera Data Platform (CDP) Cluster on Google Cloud (GCP)
0 notes
Text
Software Engineer, Backend - Remote
Tumblr media
At Mnemonic, we are excited to be building the ultimate foundational data layer for the Web3 industry, doing the hard work once so that everyone else can focus on building the most amazing and inspiring products possible. NFTs are already flipping the script on longstanding systems of ownership on the Internet and empowering creatives and brands to engage fans in new ways. This is just the very beginning. We simplify the increasingly-complex task of reading, searching, aggregating, and analyzing massive amounts of data on chain and off chain, so developers can bring better products to market faster. By empowering the innovators, we’re empowering the people. If you are passionate about data platforms, information retrieval, search, large-scale infrastructure, blockchain, web3 and the future of the Internet Mnemonic could be for you! The Role The scope of the work is broad, but generally includes: - Designing and developing foundational backend and data infrastructure that supports our platform. - Designing and developing scalable blockchain indexers, data aggregators and micro-services. - Identifying and establishing best-in-class engineering practices. What We Are Looking For - At least 2 years of professional experience. - Good understanding and proficiency in computer architecture, data structures and algorithms. - Experience of working with and contributing to highly scalable distributed systems and micro-services. - Understanding of system performance analyses and monitoring. - Writing high-quality maintainable code. Professional experience with Golang is preferred. - Ability to move quickly while managing trade-offs of performance, reliability, security, and code quality. - A low-ego, growth oriented mindset, with a bias to thoughtful action, curiosity, self-direction and team play. - A bachelors or master’s degree in Computer Science, Computer Engineering, or Mathematics is strongly preferred. Nice To Haves - Experience with blockchains. - Experience with one or more of the following: Postgres, Kubernetes, Terraform, GCP. Our Tech Stack - Backend: Golang, gRPC, Protobuf, REST. - Frontend: Typescript, Node, Next.js, React. - Storage: Postgres, Redis, Memcached, Pub/Sub, BigQuery, BigTable, CloudStorage. - Infrastructure: Kubernetes, Terraform, Docker, GCP, AWS. - Observability: Prometheus, Grafana, Jaeger. - CI/CD: GitHub Actions. - General: GitHub, Slack, Linear, PagerDuty. Perks at Mnemonic - 100% remote, with company sponsored team gatherings a few times a year. - Competitive salary and equity packages of an early stage fast-growing startup. - Few meetings so you can focus on building. - Top-notch health/dental/vision with $0 premium for employees and highly subsidized for families. - Short-term disability, long-term disability, and life insurance. - New top Apple equipment. - Home internet stipend. - One time work from home stipend to get your workspace setup. APPLY ON COMPANY WEBSITE Disclaimer:  - This job opening is available on the respective company website as of 15th May 2023. The job openings may get expired by the time you check the post. - Candidates are requested to study and verify all the job details before applying and contact the respective company representative in case they have any queries. - The owner of this site has provided all the available information regarding the location of the job i.e. work from anywhere, work from home, fully remote, remote, etc. However, if you would like to have any clarification regarding the location of the job or have any further queries or doubts; please contact the respective company representative. Viewers are advised to do full requisite enquiries regarding job location before applying for each job.   - Authentic companies never ask for payments for any job-related processes. Please carry out financial transactions (if any) at your own risk. - All the information and logos are taken from the respective company website. Read the full article
0 notes
kureeeen · 5 years
Text
Kubernetes
Kubernetes 를 공부하면서 했던 메모.
Kubernetes
今こそ始めよう!Kubernetes 入門
History
Google 사내에서 이용하던 Container Cluster Manager “Borg” 에 착안하여 만들어진 Open Source Software (OSS)
2014년 6월 런칭
2015년 7월 version 1.0.
version 1.0 이후 Linux Foundation 의 Could Native Computing Foundation (CNCF) 로 이관되어 중립적 입장에서 개발
version 1.7 Production-Ready
De facto standard
2014년 11월 Google Cloud Platform (GCP) 가 Google Container Engine (GKE, 후에 Google Kuebernetes Engine) 제공 시작
2017년 2월 Microsoft Azure 가 Azure Container Service (AKS) 릴리즈
2017년 11월 Amazon Web Service (AWS) 가 Amazon Elastic Container Service for Kubernetese (Amazon EKS) 릴리즈
Kubernetes 로 가능한 일
Docker 를 Product 레벨에서 이용하기 위해서 고려해야 했던 점들
복수의 Docker Host 관리
Container 의 Scheduling
Rolling-Update
Scaling / Auto Scaling
Monitoring Container Live/Dead
Self Healing
Service Discovery
Load Balancing
Manage Data
Manage Workload
Manage Log
Infrastructure as Code
그 외 Ecosystem과의 연계와 확장
위 문제들을 해결하기 위해 Kubernetes 가 탄생
Kubernetes 에서는 YAML 형식 manifesto 사용
apiVersion: extensions/v1beta1 kind: Deployment metadata: name: sample-deployment spec: replicas: 3 selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:latest ports: - containerPort: 80 ` **Kubernets 는,**
복수의 Docker Host 를 관리해서 container cluster 를 구축
같은 container 의 replica 로 실행하여 부하 분산과 장애에 대비 가능
부하에 따라 container 의 replica 수를 조절 (auto scaling) 가능
Disk I/O, Network 통신량 등의 workload 나 ssd, cpu 등의 Docker Host spec에 따라서 Container 배치가 가능
GCP / AWS / OpenStack 등에서 구축할 경우, availability zone 등의 부가 정보로 간단히 multi region 에 container 배치 가능
기본적으로 CPU, Memory 등의 자원 상황에 따라 scaling
자원 부족 등의 경우 Kubernetes cluster auto scaling 이용 가능
container process 감시
container process 가 멈추면 self healing
HTTP/TCP, Shell Script 등을 이용한 Health Check 도 가능
특정 Container 군에 대해 Load Balancing 적용 가능
기능별로 세분화된 micro service architecture 에 필요한 service discovery 가능
Container 와 Service 의 데이터는 Backend 의 etcd 에 보존
Container 에서 공통적으로 설정이나 Application 에서 사용하는 데이터베이스의 암호 등의 정보를 Kubernetes Cluster 에서 중앙 관리 가능
Kubernetes 를 지원
Ansible : Deploy container to Kubernetes
Apache Ignite : Kubernetes 의 Service Discovery 기능을 이용한 자동 cluster 구성과 scaling
Fluentd : Kubernetes 상의 Container Log 를 전송
Jenkins : Deploy container to Kubernetes
OpenStack : Cloud 와 연계된 Kubernetes 구축
Prometheus : Kubernetes 감시
Spark : job 을 Kubernetes 상에서 Native 실행 (YARN 대체)
Spinnaker : Deploy container to Kubernetes
etc…
Kubernetes 에는 기능 확장이 가능하도록 되어 있어 독자적인 기능을 구현하는 것도 가능
Kubernetes 구축 환경 선택
개인 Windows / Mac 상에 로컬 Kubernetes 환경을 구축
구축 툴을 사용한 cluster 구축
public cloud 의 managed Kubernetes 를 이용
환경에 따라서 일부 이용 불가한 기능도 있으나 기본적으로 어떤 환경에서도 동일한 동작이 가능하도록 CNCF 가 Conformance Program 을 제공
Local Kubernetes
Minikube
VirtualBox 필요 (xhyve, VMware Fusion 도 이용 가능)
Homebrew 등을 이용한 설치 가능
Install
`$ brew update $ brew install kubectl $ brew cask install virtualbox $ brew install minikube `
Run
minikube 기동 시, 필요에 따라 kubernetes 버전을 지정 가능 --kubernetes-version
`$ minikube start —kubernetes-version v1.8.0 `
Minikube 용으로 VirtualBox 상에 VM 가 기동될 것이고 kubectl 로 Minikube 의 클러스터를 조작하는 것이 가능
상태 확인
`$ minikube status `
Minikube cluster 삭제
`$ minikube delete `
Docker for Mac
DockerCon EU 17 에 Docker 사에서 Kubernetes support 발표
Kubernetes 의 CLI 등에서 Docker Swarm 을 조작하는 등의 연계 기능 강화
17.12 CE Edge 버전부터 로컬에 Kubernetes 를 기동하는 것이 가능
Kubernetes 버전 지정은 불가
Docker for Mac 설정에서 Enable Kubernetes 지정
이후 kubectl 로 cluster 조작 가능
`$ kubectl config use-context docker-for-desktop `
kubectl 상에선 Docker Host 가 node로 인식
`$ kubectl get nodes `
Kubernetes 관련 component가 container 로서 기동
`$ docker ps --format 'table {{.Image}}\t{{.Command}}' | grep -v pause `
Kubernetes 구축 Tool
kubeadm
Kubernetes 가 공식적으로 제공하는 구축 도구
여기서는 Ubuntu 16.04 기준으로 기록 (환경 및 필요 버전에 따라 일부 변경 필요함)
준비
`apt-get update && apt-get install -y apt-transport-https curl -s https://package.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - cat /etc/aptsources.list.d/kubernetes.list deb http://apt.kubernetes.io/ kubernetes-xenial main EOF apt-get update apt-get install -y kubelet=1.8.5-00 kubeadm=1.8.5-00 kubectl=1.8.5-00 docker.io sysctl net.bridge.bridge-nf-call-iptables=1 `
Master node 를 위한 설정
--pod-network-cidr은 cluster 내 network (pod network) 용으로 Flannel을 이용하기 위한 설정
`$ kubeadm init --pod-network-cidr=10.244.0.0/16 `
위 설정 명령으로 마지막에 Kubernetes node 를 실행하기 위한 명령어가 출력되며 이후 node 추가시에 실행한다.
`$ kubeadm join --token ... 10.240.0.7:6443 --discovery-token-ca-cert-hash sha256:... `
kubectl 에서 사용할 인증 파일 준비
`$ mkdir -p $HOME/.kube $ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config $ sudo chown $(id -u):$(id -g) $HOME/.kube/config `
Flannel deamon container 기동
`$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml `
Flannel 이 외에도 다른 선택이 가능 Installing a pod network add-on
Rancher
Rancher Labs 사
Open Source Container Platform
version 1.0 에서는 Kubernetes 도 서포트 하는 형식
version 2.0 부터는 Kubernetes 를 메인으로
Kubernetes cluster 를 다양한 플랫폼에서 가능 (AWS, OpenStack, VMware etc..)
기존의 Kubernetes cluster 를 Rancher 관리로 전환 가능
중앙집중적인 인증, 모니터링, WebUI 등의 기능을 제공
풍부한 Application Catalog
Rancher Server 기동
`docker run -d --restart=unless-stopped -p 8080:8080 rancher/server:v2.0.0-alpha10 `
이 Rancher Server 에서 각 Kubernetes cluster 의 관리와 cloud provider 연계 등을 수행
etc
Techtonic (CoreOS)
Kubespray
kops
OpenStack Magnum
Public Cloud managed Kubernetes
GKE (Google Kubernetes Engine)
많은 편리한 기능을 제공
GCP (Google Cloud Platform) 와 Integration.
HTTP LoadBalancer (Ingress) 사용 가능
NodePool
GUI or gcloud 명령어 사용
cluster version 간단 update
GCE (Google Compute Engine) 를 사용한 cluster 구축 가능
Container 를 사용하여 Kubernetes 노드가 재생성되어도 서비스에 영향을 미치지 않게 설계 가능
Kubernetes cluster 내부의 node 에 label 을 붙여 Group 화 가능
Group 화 하여 Scheduling 에 이용 가능
cloud 명령어로 cluster 구축
`$ gcloud container clusters create example-cluster `
인증 정보 저장
`$ gcloud container clusters get-credentials example-cluster `
etc
Google Kubernetes Engine
AKS (Azure Container Service)
Azure Container Service
EKS (Elastic Container Service for Kubernetes)
Amazon EKS
Kubernetes 기초
Kubernetes 는 실제로 Docker 이외의 container runtime 을 이용한 host 도 관리할 수 있도록 되어 있다. Kubernetes = Kubernetes Master + Kubernetes Node
Kubernetes Master
Kubernetes Node
API endpoint 제공
container scheduling
container scaling
Docker Host 처럼 실제로 container 가 동작하는 host
Kubernetes cluster 를 조작할 땐, CLI tool 인 kubectl 과 YAML 형식 manifest file 을 사용하여 Kubernetes Master 에 resource 등록 kubectl 도 내부적으로는 Kubernetes Master API 를 사용 = Library, curl 등을 이용한 조작도 가능
Kubernetes & Resource
resource 를 등록하면 비동기로 container 실행과 load balancer 작성된다. Resource 종류에 따라 YAML manifest 에 사용되는 parameter 가 상이
Kubernetes API Reference Docs
Kubernetes Resource
Workloads : container 실행에 관련
Discovery & LB : container 외부 공개 같은 endpoint를 제공
Config & Storage : 설정, 기밀정보, Persistent volume 등에 관련
Cluster : security & quota 등에 관련
Metadata : resource 조작
Workloads
cluster 상의 container 를 기동하기 위해 이용 내부적으로 이용하는 것을 제외한 이용자가 직접 이용하는 것으로는 다음과 같은 종류
Pod
ReplicationController
ReplicaSet
Deployment
DaemonSet
StatefulSet
Job
CronJob
Discovery & LB
container 의 service discovery, endpoint 등을 제공 내부적으로 이용하는 것을 제외한 이용자가 직접 이용하는 것으로는 다음과 같은 종류
Service : endpoint 의 제공방식에 따라 복수의 타입이 존재
Ingress
ClusterIP
NodePort
LoadBalancer
ExternalIP
ExternalName
Haedless
Config & Storage
설정이나 기밀 데이터 등을 container 에 넣거나 Persistent volume을 제공
Secret
ConfigMap
PersistentVolumeClaim Secret 과 ConfigMap 은 key-value 형식의 데이터 구조
Cluster
cluster 의 동작을 정의
Namespace
ServiceAccount
Role
ClusterRole
RoleBinding
ClusterRoleBinding
NetworkPolicy
ResourceQuota
PersistentVolume
Node
Metadata
cluster 내부의 다른 resource 동작을 제어
CustomResourceDefinition
LimitRange
HorizontalPodAutoscaler
Namespace 에 따른 가상 cluster 의 분리
Kubernetes 가상 cluster 분리 기능 (완전 분리는 아님) 하나의 Kubernetes cluster 를 복수 팀에서 이용 가능하게 함 Kubernetes cluster 는 RBAC (Role-Based Access Control) 이 기본 설정으로 Namesapce 를 대상으로 권한 설정을 할 수 있어 분리성을 높이는 것이 가능
초기 상태의 3가지 Namespace
default
kube-system : Kubernetes cluster 의 component와 addon 관련
kube-public : 모두가 사용 가능한 ConfigMap 등을 배치
CLI tool kubectl & 인증 정보
kubectl 이 Kubernetes Master 와 통신하기 위해 접속 서버의 정보와 인증 정보 등이 필요. 기본으로는 `~/.kube/config` 에 기록된 정보를 이용 `~/.kube/config` 도 YAML Manifest `~/.kube/config` example <pre>`apiVersion: v1 kind: Config preferences: {} clusters: - name: sample-cluster cluster: server: https://localhost:6443 users: - name: sample-user user: client-certificate-data: agllk5ksdgls2... client-key-data: aglk14l1t1ok15... contexts: - name: sample-context context: cluster: sample-cluster namespace: default user: sample-user current-context: sample-context `</pre>
`~/.kube/config` 에는 기본적으로 cluster, user, context 3가지를 정의 cluster : 접속하기 위한 cluster 정보 user : 인증 정보 context : cluster 와 user 페어에 namespace 지정 kubectl 를 사용한 설정 <pre>`# 클러스터 정의 $ kubectl config set-cluster prd-cluster --server=https://localhost:6443 # 인증정보 정의 $ kubectl config set-credentials admin-user \ --client-certificate \ --client-key=./sample.key \ --embed-certs=true # context(cluster, 인증정보, Namespace 정의) $ kubectl config --set-context prd-admin \ --cluster=prd-cluster \ --user=admin-user \ --namespace=default `</pre>
context 를 전환하는 것으로 복수의 cluster 와 user 를 사용하는 것이 가능
`# context 전환 $ kubectx prd-admin Switched to context "prd-admin". # namespace 전환 $ kubens kube-system Context "prd-admin" is modified. Active Namespace is "kube-system". `
## kubectl & YAML Manifest YAML Manifest 를 사용한 container 기동
pod 작성
`# sample-pod.yml apiVersion: vi kind: Pod metadata: name: sample-pod spec: containers: - name: nginx-container image: nginx:1.12 `
resource 작성
`# create resource $ kubectl create -f sample-pod.yml `
resource 삭제
`# delete resource $ kubectl delete -f sample-pod.yml `
resource update
`# apply 외 set, replace, edit 등도 사용 가능 $ kubectl apply -f sample-pod.yml `
## kubectl 사용법
resource 목록 획득 (get)
`$ kubectl get pods # 획득한 목록 상세 출력 $ kubectl get pods -o wide `
-o, —output 옵션을 사용하여 JSON / YAML / custom-columns / Go Template 등 다양한 형식으로 출력하는 것이 가능. 그리고 상세한 정보까지 확인 가능. pods 를 all 로 바꾸면 모든 리소스 일람 획득
resource 상세 정보 확인 (describe)
`$ kubectl describe pods sample-pod $ kubectl describe node k15l1 `
get 명령어 보다 resource 에 관련한 이벤트나 더 상세한 정보를 확인 가능
로그 확인 (logs)
`# Pod 내 container 의 로그 출력 $ kubectl logs sample-pod # 복수 container 가 포함된 Pod 에서 특정 container 의 로그 출력 $ kubectl logs sample-pod -c nginx-container # log follow option -f $ kubectl logs -f sample-pod # 최근 1시간, 10건, timestamp 표시 $ kubectl logs --since=1h --tail=10 --timestamp=true sample-pod `
Pod 상의 특정 명령 실행 (exec)
`# Pod 내 container 에서 /bin/sh $ kubectl exec -it sample-pod /bin/sh # 복수 container 가 포함된 Pod 의 특정 container 에서 /bin/sh $ kubectl exec -it sample-pod -c nginx-container /bin/sh # 인수가 있는 명령어의 경우, -- 이후에 기재 $ kubectl exec -it sample-pod -- /bin/ls -l / `
port-forward
`# localhost:8888 로 들어오는 데이터를 Pod의 80 포트로 전송 $ kubectl port-forward sample-pod 8888:80 # 이후 localhost:8888 을 통해 Pod의 80 포트로 접근 가능 $ curl localhost:8888 `
shell completion
`# bash $ source
## Kubernetes Workloads Resource ## Workloads Resource
cluster 상에서 container 를 기동하기 위해 이용
8 종류의 resource 존재
Pod
ReplicationController
ReplicaSet
Deployment
DaemonSet
StatefulSet
Job
CronJob
디버그, 확인 용도로 주로 이용
ReplicaSet 사용 추천
Pod 을 scale 관리
scale 관리할 workload 에서 기본적으로 사용 추천
각 노드에 1 Pod 씩 배치
Persistent Data 나 stateful 한 workload 의 경우 사용
work queue & task 등의 container 종료가 필요한 workload 에 사용
정기적으로 Job을 수행
Pod
Kubernetes Workloads Resource 의 최소단위
1개 이상의 container 로 구성
Pod 단위로 IP Address 가 할당
대부분의 경우 하나의 Pod은 하나의 container 를 포함하는 경우가 대부분
proxy, local cache, dynamic configure, ssh 등의 보조 역할을 하는 container 를 같이 포함 하는 경우도 있다.
같은 Pod 에 속한 container 들은 같은 IP Address
container 들은 localhost 로 서로 통신 가능
Network Namespace 는 Pod 내에서 공유
보조하는 sub container 를 side car 라고 부르기도 한다.
Pod 작성
sample pod 을 작성하는 pod_sample.ymlapiVersion: v1 kind: Pod metadata: name: sample-pod spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80
nginx:1.12 image를 사용한 container 가 하나에 80 포트를 개방
설정 파일을 기반으로 Pod 작성
`$ kubectl apply -f ./pod_sample.yml `
기동한 Pod 확인
`$ kubectl get pods # 보다 자세한 정보 출력 $ kubectl get pods --output wide `
**2 개의 container 를 포함한 Pod 작성**
2pod_sample.yml
`apiVersion: v1 kind: Pod metadata: name: sample-2pod spec: containers: - name: nginx-container-112 image: nginx:1.12 ports: - containerPort: 80 - name: nginx-container-113 image: nginx:1.13 ports: - containerPort: 8080 `
**container 내부 진입**
container 의 bash 등을 실행하여 진입
`$ kubectl exec -it sample-pod /bin/bash `
-t : 모의 단말 생성
-i : 표준입력 pass through
ReplicaSet / ReplicationController
Pod 의 replica 를 생성하여 지정한 수의 Pod을 유지하는 resource
초창기 ReplicationController 였으나 ReplicaSet 으로 후에 변경됨
ReplicationController 는 equality-based selector 이용. 폐지 예정.
ReplicaSet 은 set-based selector 이용. 기본적으로 이를 이용할 것.
ReplicaSet 작성
sample ReplicaSet 작성 (rs_sample.yml)
`apiVersion: apps/v1 kind: ReplicaSet metadata: name: sample-rs spec: replicas: 3 selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80 `
ReplicaSet 작성
`$ kubectl apply -f ./rs_sample.yml `
ReplicaSet 확인
`$ kubectl get rs -o wide `
Label 지정하여 Pod 확인
`$ kubectl get pod -l app=sample-app -o wide `
**Pod 정지 & auto healing**
auto healing = ReplicaSet 은 node 나 pod 에 장애가 발생해도 pod 수를 지정한 수만큼 유지되도록 별도의 node 에 container 를 기동해주기에 장애에 대비하여 영향을 최소화할 수 있도록 가능하다.
Pod 삭제
`$ kubectl delete pod sample-rs-7r6sr `
Pod 삭제 후 다시 Pod 확인 하면 ReplicaSet 이 새로 Pod 이 생성된 것을 확인 가능
ReplicaSet 의 Pod 증감은 kubectl describe rs 명령어로 이력을 확인 가능
Label & ReplicaSet
ReplicaSet 은 Kubernetes 가 Pod 을 감시하여 수를 조정
감시하기 위한 Pod Label 은 spec.selector 에서 지정
특정 라벨이 붙은 Pod 의 수를 세는 것으로 감시
부족하면 생성, 초과하면 삭제
`selector: matchLabels: app: sample-app `
생성되는 Pod Label 은 labels 에 정의.
spec.template.metadata.labels 의 부분에도 app:sample-app 식으로 설정이 들어가서 Label 가 부여된 상태로 Pod 이 생성됨.
`labels: app: sample-app `
spec.selector 와 spec.template.metadata.labels 가 일치하지 않으면 Pod 이 끝없이 생성되다가 에러가 발생하게 될 것…
ReplicaSet 을 이용하지 않고 외부에서 별도로 동일한 label 을 사용하는 Pod 을 띄우면 초과한 수만큼의 Pod 을 삭제하게 된다. 이 때, 어느 Pod 이 지워지게 될지는 알 수 없으므로 주의가 필요
하나의 container 에 복수 label 을 부여하는 것도 가능
`labels: env: dev codename: system_a role: web-front `
**Pod scaling**
yaml config 을 수정하여 kubectl apply -f FILENAME 을 실행하여 변경된 설정 적용
kubectl scale 명령어로 scale 처리
scale 명령어로 처리 가능한 대상은
Deployment
Job
ReplicaSet
ReplicationController
`$ kubectl scale rs sample-rs --replicas 5 `
## Deployment
복수의 ReplicaSet 을 관리하여 rolling update 와 roll-back 등을 실행 가능
방식
전환 방식
Kubernetes 에서 가장 추천하는 container 의 기동 방법
새로운 ReplicaSet 을 작성
새로운 ReplicaSet 상의 Replica count 를 증가시킴
오래된 ReplicaSet 상의 Replica count 를 감소시킴
2, 3 을 반복
새로운 ReplicaSet 상에서 container 가 기동하는지, health check를 통과하는지 확인하면서
ReplicaSet 을 이행할 때의 Pod 수의 상세 지정이 가능
Deployment 작성
deployment_sample.yml
`apiVersion: apps/v1 kind: Deployment metadata: name: sample-deployment spec: replicas: 3 selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80 `
deployment 작성
`# record 옵션을 사용하여 update 시 이력을 보존 가능 $ kubectl apply -f ./deployment_sample.yml --record `
이력은 metadata.annotations.kubernetes.io/change-cause에 보존
현재 ReplicaSet 의 Revision 번호는 metadata.annotations.deployment.kubernetes.io/revision에서 확인 가능
`$ kubectl get rs -o yaml | head `
kubectl run 으로 거의 같은 deployment 를 생성하는 것도 가능
다만 default label run:sample-deployment 가 부여되는 차이 정도
`$ kubectl run sample-deployment --image nginx:1.12 --replicas 3 --port 80 `
deployment 확인
`$ kubectl get deployment $ kubectl get rs $ kubectl get pods `
container update
`# nginx container iamge 버전을 변경 $ kubectl set image deployment sample-deployment nginx-container=nginx:1.13 `
**Deployment update condition**
Deployment 에서 변경이 있으면 ReplicaSet 이 생성된다.
replica 수는 변경 사항 대상에 포함되지 않는다
생성되는 Pod 의 내용 변경이 대상
spec.template 의 변경이 있으면 ReplicaSet 을 신규 생성하여 rolling update 수행
spec.template이하의 구조체 해쉬값을 계산하여 그것을 이용해 label 을 붙이고 관리를 한다.
`# Deployment using hash value $ kubectl get rs sample-deployment-xxx -o yaml `
**Roll-back**
ReplicaSet 은 기본적으로 이력으로서 형태가 남고 replica 수를 0으로 하고 있다.
변경 이력 확인 kubectl rollout history
`$ kubectl rollout history deployment sample-deployment `
deployment 작성 시 —record 를 사용하면 CHANGE_CAUSE 부분의 값도 존재
roll-back 시 revision 값 지정 가능. 미지정시 하나 전 revision 사용.
`# 한 단계 전 revision (default --to-revision = 0) $ kubectl rollout undo deployment sample-deployment # revision 지정 $ kubectl rollout undo deployment sample-deployment --to-revision 1 `
roll-back 기능보다 이전 YAML 파일을 kubectl apply로 적용하는게 더 편할 수 있음.
spec.template을 같은 걸로 돌리면 Template Hash 도 동일하여 kubectl rollout 과 동일한 동작을 수행하게 된다.
Deployment Scaling
ReplicaSets 와 동일한 방법으로 kubectl scale or kubectl apply -f을 사용하여 scaling 가능
보다 고급진 update 방법
recreate 라는 방식이 존재
DaemonSet
ReplicaSet 의 특수한 형식
모든 Node 에 1 pod 씩 배치
모든 Node 에서 반드시 실행되어야 하는 process 를 위해 이용
replica 수 지정 불가
2 pod 씩 배치 불가
ReplicaSet 은 각 Kubernetes Node 상에 상황에 따라 Pod 을 배치하는 것이기에 균등하게 배포된다는 보장이 없다.
DaemonSet 작성
ds_sample.yml
`apiVersion: apps/v1 kind: DaemonSet metadata: name: sample-ds spec: selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80 `
DaemonSet 작성
`$ kubectl apply -f ./ds_sample.yml `
확인
`$ kubectl get pods -o wide `
## StatefulSet
ReplicaSet 의 특수한 형태
database 처럼 stateful 한 workloads 에 대응하기 위함
생성되는 Pod 명이 숫자로 indexing
persistent 성
sample-statefulset-1, sample-statefulset-2, …
PersistentVolume을 사용하는 경우 같은 disk 를 이용하여 재작성
Pod 명이 바뀌지 않음
StatefulSet 작성
spec.volumeClaimTemplates 지정 가능
statefulset-sample.yml
persistent data 영역을 재사용하여 pod 이 복귀했을 때 동일 데이터를 사용하여 container 가 작성되도록 가능
`apiVersion: apps/v1 kind: StatefulSet metadata: name: sample-statefulset spec: replicas: 3 selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80 volumeMounts: - name: www mountPath: /usr/share/nginx/html volumeClaimTemplates: - metadata: name: www spec: accessModes: - ReadWriteOnce resources: requests: storage: 1Gi `
StatefulSet 작성
`$ kubectl apply -f ./statefulset_sample.yml `
확인 (ReplicaSet 과 거의 동일한 정보)
`$ kubectl get statefulset # Pod 이름에 연속된 수로 index 가 suffix 된 것을 확인 $ kubectl get pods -o wide `
scale out 시 0, 1, 2 의 순으로 만들어짐
scale in 시 2, 1, 0 의 순으로 삭제
StatefulSet Scaling
ReplicaSets 와 동일 kubectl scale or kubectl apply -f
Persistent 영역 data 보존 확인
`$ kubectl exec -it sample-statefulset-0 ls /usr/share/nginx/html/sample.html ls: cannot access /usr/share/nginx/html/sample.html: No such file or directory $ kubectl exec -it sample-statefulset-0 touch /usr/share/nginx/html/sample.html $ kubectl exec -it sample-statefulset-0 ls /usr/share/nginx/html/sample.html /usr/share/nginx/html/sample.html `
kubectl 로 Pod 삭제를 하던지 container 내부에서 Exception 등이 발생하는 등으로 container 가 정지해도 file 이 사라지지 않는다.
Pod 명이 바뀌지 않아도 IP Address 는 바뀔 수 있다.
Life Cycle
ReplicaSet 과 다르게 복수의 Pod 을 병렬로 생성하지 않고 1개씩 생성하여 Ready 상태가 되면 다음 Pod 을 작성한다.
삭제 시, index 가 가능 큰 (최신) container 부터 삭제
index:0 이 Master 가 되도록 구성을 짤 수 있다.
Job
container 를 이용하여 일회성 처리를 수행
병렬 실행이 가능하면서 지정한 횟수만큼 container 를 실행 (정상종료) 하는 것을 보장
Job 을 이용 가능한 경우와 Pod 과의 차이
Pod 이 정지하는 것을 전제로 만들어져 있는가?
Pod, ReplicaSets 에서 정지=예상치 못한 에러
Job 은 정지=정상종료
patch 등의 처리에 적합
Job 작성
job_sample.yml : 60초 sleep
ReplicaSets 와 동일하게 label 과 selector 를 지정가능하지만 kubernetes 에서 자동으로 충돌하지 않도록 uuid 를 자동 생성함으로 굳이 지정할 필요 없다.
`apiVersion: batch/v1 kind: Job metadata: name: sample-job spec: completions: 1 parallelism: 1 backoffLimit: 10 template: spec: containers: - name: sleep-container image: centos:latest command: ["sleep"] args: ["60"] restartPolicy: Never `
Job 작성
`$ kubectl apply -f job_sample.yml `
Job 확인
`$ kubectl get jobs $ kubectl get pods `
**restartPolicy**
spec.template.spec.restartPolicy 에는 OnFailure or Never 지정 가능
Never : 장애 시 신규 Pod 작성
OnFailure : 장애 시 동일 Pod 이용하여 Job 재개 (restart count 가 올라간다)
Parallel Job & work queue
completions : 실행 횟수
parallelism : 병렬수
backoffLimit : 실패 허용 횟수. 미지정 시 6
1개씩 work queue 형태로 실행할 경우 completions 를 미지정
parallelism만 지정하면 Persistent 하게 Job을 계속 실행
deployment 등과 동일하게 kubectl scale job… 명령으로 나중에 제어 하는 것도 가능
CronJob
ScheduledJob -> CronJob 으로 명칭 변경됨
Cron 처럼 scheduling 된 시간에 Job 을 생성
create CronJob
cronjob_sample.yml : 60초 마다 30초 sleep
`apiVersion: batch/v1beta1 kind: CronJob metadata: name: sample-cronjob spec: schedule: "*/1 * * * *" concurrencyPolicy: Forbid startingDeadlineSeconds: 30 successfulJobHistoryLimit: 5 failedJobsHistoryLimit: 5 jobTemplate: spec: template: spec: containers: - name: sleep-container image: centos:latest command: ["sleep"] args: ["30"] restartPolicy: Never `
create
`$ kubectl apply -f cronjob_sample.yml `
별도 설정없이 kubectl run —schedule 로 create 가능
`$ kubectl run sample-cronjob --schedule = "*/1 * * *" --restart Never --image centos:latest -- sleep 30 `
확인
`$ kubectl get cronjob $ kubectl get job `
**일시 정지**
spec.suspend 가 true 로 설정되어 있으면 schedule 대상에서 제외됨
YAML 을 변경한 후 kubectl apply
kubectl patch 명령어로도 가능
`$ kubectl patch cronjob sample-cronjob -p '{"spec":{"suspend":true}}' `
kubectl patch에서는 내부적으로 HTTP PATCH method 를 사용하여 Kubernetes 독자적인 Strategic Merge Patch 가 수행된다.
실제로 수행되는 request 를 확인하고 싶으면 -v (Verbose) 옵션 사용
`$ kubectl -v=10 patch cronjob sample-cronjob -p '{"spec":{"suspend":true}}' `
kubectl get cronjob 에서 SUSPEND 항목이 True 로 된 것을 확인
다시 scheduling 대상에 넣고 싶으면 spec.suspend 를 false 로 설정
동시 실행 제어
spec.concurrencyPolicy
spec.startingDeadlineSeconds : Kubernetes Master 가 일시적으로 동작 불가한 경우 등 Job 개시 시간이 늦어졌을 때 Job 을 개시 허용할 수 있는 시간(초)를 지정
spec.successfulJobsHistoryLimit : 보존하는 성공 Job 수.
spec.failedJobsHistoryLimit : 보존하는 실패 Job 수.
Allow (default) : 동시 실행과 관련 제어 하지 않음
Forbid : 이전 Job 이 종료되지 않았으면 새로운 Job 을 실행하지 않음.
Replace : 이전 Job 을 취소하고 새로운 Job 을 실행
300 의 경우, 지정된 시간 보다 5분 늦어도 실행 가능
기본으론 늦어진 시간과 관계없이 Job 생성 가능
default 3.
0 은 바로 삭제.
default 3.
0 은 바로 삭제
Discovery & LB resource
cluster 상의 container 에 접근할 수 있는 endpoint 제공과 label 이 일치하는 container 를 찾을 수 있게 해줌
2 종류가 존재
Service : Endpoint 의 제공방법이 다른 type 이 몇가지 존재
Ingress
ClusterIP
NodePort
LoadBalancer
ExternalIP
ExternalName
Headless (None)
Cluster 내 Network 와 Service
Kubernetes 에서 cluster 를 구축하면 Pod 을 위한 Internal 네트워크가 구성된다.
Internal Network 의 구성은 CNI (Common Network Interface) 라는 pluggable 한 module 에 따라 다르지만, 기본적으로는 Node 별로 상이한 network segment 를 가지게 되고, Node 간의 traffic 은 VXLAN 이나 L2 Routing 을 이용하여 전송된다.
Kubernetes cluster 에 할당된 Internal network segment 는 자동적으로 분할되어 node 별로 network segment 를 할당하기 때문에 의식할 필요 없이 공통의 internal network 를 이용 가능하다.
이러한 특징으로 기본적으로 container 간 통신이 가능하지만 Service 기능을 이용함으로써 얻을 수 있는 이점이 있다.
Pod 에 발생하는 traffic 의 load balancing
Service discovery & internal dns
위 2가지 이점은 모든 Service Type 에서 이용 가능
Pod Traffic 의 load balancing
Service 는 수신한 traffic을 복수의 Pod 에 load balancing 하는 기능을 제공
Endpoint 의 종류에는 cluster 내부에서 이용 가능한 VIP (Virtual IP Address) 와 외부의 load balancer 의 VIP 등 다양한 종류가 제공된다
example) deployment_sample.yml
Deployment 로 복수 Pod 이 생성되면 제각각 다른 IP Address 를 가지게 되는데 이대로는 부하분산을 이용할 수 없지만 Service 가 복수의 Pod 을 대상으로 load balancing 가능한 endpoint 를 제공한다.
`apiVersion: apps/v1 kind: Deployment metadata: name: sample-deployment spec: replicas: 3 selector: matchLabels: app: sample-app template: metadata: labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 ports: - containerPort: 80 `
`$ kubectl apply -f deployment_sample.yml `
Deployment 로 생성된 Pod 의 label 과 pod-template-hash 라벨을 사용한다.
`$ kubectl get pods sample-deployment-5d.. -o jsonpath='{.metadata.labels}' map[app:sample-app pod-template-hash:...]% `
전송할 Pod 은 spec.selector 를 사용하여 설정 (clusterip_sample.yml)
`apiVersion: v1 kind: Service metadata: name: sample-clusterip spec: type: ClusterIP ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 selector: app: sample-app `
`$ kubectl apply -f clusterip_sample.yml `
Service 를 만들면 상세정보의 Endpoint 부분에 복수의 IP Address 와 Port 가 표시된다. 이는 selector 에 지정된 조건에 매칭된 Pod 의 IP Address 와 Port.
`$ kubectl describe svc sample-clusterip `
Pod 의 IP 를 비교하기 위해 특정 JSON path 를 column 으로 출력
`$ kubectl get pods -l app=sample-app -o custom-columns="NAME:{metadata.name},IP:{status.podIP}" `
load balancing 확인을 쉽게 위해 테스트로 각 pod 상의 index.html 을 변경.
Pod 의 이름을 획득하고 각각의 pod 의 hostname 을 획득한 것을 index.html 에 기록
`for PODNAME in `kubectl get pods -l app=sample-app -o jsonpath='{.items[*].metadata.name}'`; do kubectl exec -it ${PODNAME} -- sh -c "hostname > /usr/share/nginx/html/index.html"; done `
일시적으로 Pod 을 기동하여 Service 의 endpoint 로 request
`$ kubectl run --image=centos:7 --restart=Never --rm -i testpod -- curl -s http://[load balancer ip]:[port] `
**Service Discovery 와 Internal DNS**
Service Discovery
Service Discovery 방법
Service 가 제공
특정 조건의 대상이 되는 member 를 열거하거나, 이름으로 endpoint 를 판별
Service 에 속하는 Pod 을 열거하거나 Service 이름으로 endpoint 정보를 반환
A record 를 이용
SRV record 를 이용
환경변수를 이용
A record 를 이용한 Service Discovery
Service 를 만들면 DNS record 가 자동적으로 등록된다
내부적으로는 kube-dns 라는 system component가 endpoint 를 대상으로 DNS record를 생성
Service 를 이용하여 DNS 명을 사용할 수 있으면 IP Address 관련한 관리나 설정을 신경쓰지 않아도 되기 때문에 이를 사용하는 것이 편리
`# IP 대신에 sample-clusterip 라는 Service 명을 이용 가능 $ kubectl run --image=centos:7 --restart=Never --rm -i testpod -- curl -s http://sample-clusterip:8080 `
실제로 kube-dns 에 등록되는 정식 FQDN 은 [Service name].[Namespace name].svc.[ClusterDomain name]
`# container 내부에서 sample-clusterip.default.svc.cluster.local 를 조회 $ kubectl run --image=centos:6 --restart=Never --rm -i testpod -- dig sample-clusterip.default.svc.cluster.local `
FQDN 에서는 Service 충돌을 방지하기 위해 Namespace 등이 포함되어 있으나 container 내부의 /etc/resolv.conf 에 간략한 domain 이 지정되어 있어 실제로는 sample-clusterip.default 나 sample-clusterip 만으로도 조회가 가능하다.
IP 로도 FQDN 을 반대로 조회하는 것도 가능
`$ kubectl run --image=centos:6 --restart=Never --rm -i testpod -- dig -x 10.11.245.11 `
**SRV record 를 이용한 Service Discovery**
[_Service Port name].[_Protocol].[Service name].[Namespace name].svc.[ClusterDomain name]의 형식으로도 확인 가능
`$ kubectl run --image=centos:6 --restart=Never --rm -i testpod -- dig _http-port._tcp.sample-clusterip.default.svc.cluster.local SRV `
**환경변수를 이용한 Service Discovery**
Pod 내부에서는 환경변수로도 같은 Namespace 의 서비스가 확인 가능하도록 되어 있다.
‘-‘ 가 포함된 서비스 이름은 ‘_’ 로, 그리고 대문자로 변환된다.
docker --links ...와 같은 형식으로 환경변수가 보존
container 기동 후에 Service 추가나 삭제에 따라 환경변수가 갱신되는 것은 아니라서 예상 못한 사고가 발생할 가능성도 있다.
Service 보다 Pod 이 먼저 생성된 경우에는 환경변수가 등록되어 있지 않기에 Pod 을 재생성해야 한다.
Docker 만으로 이용하던 환경에서 이식할 때에도 편리
`$ kubectl exec -it sample-deployment-... env | grep -i sample_clusterip `
**복수 port 를 사용하는 Service 와 Service Discovery**
clusterip_multi_sample.yml
`apiVersion: v1 kind: Service metadata: name: sample-clusterip spec: type: ClusterIP ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 - name: "https-port" protocol: "TCP" port: 8443 targetPort: 443 selector: app: sample-app `
## ClusterIP
가장 기본
Kubernetes cluster 내부에서만 접근 가능한 Internal Network 의 VIP 가 할당된다
ClusterIP 로의 통신은 node 상에서 동작하는 system component 인 kube-proxy가 Pod을 대상으로 전송한다. (Proxy-mode 에 따라 상이)
Kubernetes cluster 외부에서의 접근이 필요없는 곳에 이용한다
기본으로는 Kubernetes API 에 접속하기 위한 Service 가 만들어져 있고 ClusterIP 가 를 사용한다.
`# TYPE 의 ClusterIP 확인 $ kubectl get svc `
**create ClusterIP Service**
clusterip_sample.yml
`apiVersion: v1 kind: Service metadata: name: sample-clusterip spec: type: ClusterIP ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 selector: app: sample-app `
type: ClusterIP 지정
spec.ports[x].port 에는 ClusterIP로 수신하는 Port 번호
spec.ports[x].targetPort 는 전달할 container 의 Port 번호
Static ClusterIP VIP 지정
database 를 이용하는 등 기본적으로는 Kubernetes Service에 등록된 내부 DNS record를 이용하여 host 지정하는 것을 추천
수동으로 지정할 경우 spec.clusterIP를 지정 (clusterip_vip_sample.yml)
`apiVersion: v1 kind: Service metadata: name: sample-clusterip spec: type: ClusterIP clusterIP: 10.11.111.11 ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 selector: app: sample-app `
이미 ClusterIP Service가 생성되어 있는 상태에서는 ClusterIP 를 변경할 수 없다.
kubectl apply 로도 불가능
먼저 생성된 Service 를 삭제해야 한다.
ExternalIP
특정 Kubernetes Node 의 IP:Port 로 수신한 traffic 을 container 로 전달하는 방식으로 외부와 연결
create ExternalIP Service
externalip_sample.yml
`apiVersion: v1 kind: Service metadata: name: sample-externalip spec: type: ClusterIP externalIPs: - 10.1.0.7 - 10.1.0.8 ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 selector: app: sample-app `
type: ExternalIP 가 아닌 type: ClusterIP 인 것에 주의
spec.ports[x].port 는 ClusterIP 로 수신하는 Port
spec.ports[x].targetPort 는 전달할 container 의 Port
모든 Kubernetes Node 를 지정할 필요는 없음
ExternalIP 에 이용 가능한 IP Address 는 node 정보에서 확인
GKE 는 OS 상에서는 global IP Address가 인식되어 있지 않아서 이용 불가
`# IP address 확인 $ kubectl get node -o custom-columns="NAME:{metadata.name},IP:{status.addresses[].address}" `
ExternalIP Service 를 생성해도 container 내부에서 사용할 ClusterIP 도 자동적으로 할당된다.
container 안에서 DNS로 ExternalIP Service 확인
`$ kubectl run --image=centos:6 --restart=Never --rm -i testpod -- dig sample-externalip.default.svc.cluster.local `
ExternalIP 를 이용하는 node 의 port 상태
`$ ss -napt | grep 8080 `
ExternalIP 를 사용하면 Kubernetes cluster 밖에서도 접근이 가능하고 또한 Pod 에 분산된다.
NodePort
모든 Kubernetes Node 의 IP:Port 에서 수신한 traffic 을 container 에 전송
ExternalIP Service 의 모든 Node 버전 비슷한 느낌
Docker Swarm 의 Service 를 Expose 한 경우와 비슷
create NodePort Service
nodeport_sample.yml
`apiVersion: v1 kind: Service metadata: name: sample-nodeport spec: type: NodePort ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 nodePort: 30080 selector: app: sample-app `
spec.ports[x].port 는 ClusterIP 로 수신하는 Port
spec.ports[x].targetPort 는 전달할 container Port
spec.ports[x].nodePort 는 모든 Kubernetes Node 에서 수신할 Port
container 안에서 통신에 사용할 ClusterIP 도 자동적으로 할당된다
지정하지 않으면 자동으로 비어있는 번호를 사용
Kubernetes 기본으로는 이용할 수 있는 번호가 30000~32767
복수의 NodePort 가 동일 Port 사용 불가
Kubernetes Master 설정에서 변경 가능
`$ kubectl get svc `
container 안에서 확인하면 내부 DNS 가 반환하는 IP Address 는 External IP 가 아닌 Cluster IP
`$ kubectl run --image=centos:6 --restart=Never --rm -i testpod -- dig sample-nodeport.default.svc.cluster.local `
Kubernetes Node 상에서 Port 상태를 확인하면 nodePort 에 지정한 값으로 Listen
`$ ss -napt | grep 30080 `
ExternalIP 와 다르게 모든 Node 의 IP Address 로 Kubernetes Cluster 외부에서 접근 가능하며 Pod 으로의 request 도 분산된다.
GKE 도 GCE에 할당된 global IP Address 로 접근 가능
Node 간 통신의 배제 (Node 를 건넌 load balancing 배제)
NodePort 에서는 Node 상의 NodePort 에 도달한 packet 은 Node 를 건너서도 load balancing 이 이루어진다
DaemonSet 등을 사용하면 각 Node에 1 Pod 이 존재하기에 같은 Node 상의 Pod 에만 전달하고 싶을 때 사용
spec.externalTrafficPolicy 를 사용하여 실현 가능
externalTrafficPolicy 를 Cluster 에서 Local 로 변경하기 위해선 YAML 이용 (nodeport_local_sample.yml)
Cluster (Default)
Local
Node 에 도달한 후 각 Node 에 load balancing
실제로는 kube-proxy 설정으로 iptables 의 proxy mode를 사용할 경우, 자신의 Node 에 좀 더 많이 전달되도록 되는 것으로 보여짐 (iptables-save 등으로 statistics 부분을 확인)
도달한 Node 에 속한 Pod 에 전달 (no load balancing)
만약 Pod 이 존재하지 않으면 Response 불가
만일 Pod 이 복수 존재한다면 균등하게 분배
`apiVersion: v1 kind: Service metadata: name: sample-nodeport-local spec: type: NodePort externalTrafficPolicy: Local ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 nodePort: 30081 selector: app: sample-app `
## LoadBalancer
가장 사용성이 좋고 실용적
Kubernetes Cluster 외부의 LoadBalancer 에 VIP 를 할당 가능
NodePort 등에선 결국 Node 에 할당된 IP Address 에 endpoint 역할까지 담당시키는 것이라 SPoF (Single Point of Failure) 로 Node 장애에 약하다
외부의 load balancer 를 이용하는 것으로 Kubernetes Node 장애에 강하다
단 외부 LoadBalancer 와 연계 가능한 환경으로 GCP, AWS, Azure, OpenStack 등의 CloudProvider 에 한정된다 (이는 추후에 점차 확대될 수 있다)
NodePort Service 를 만들어서 Cluster 외부의 Load Balancer 에서 Kubernetes Node 에 balancing 한다는 느낌
create LoadBalancer Service
lb_sample.yml
`apiVersion: v1 kind: Service metadata: name: sample-lb spec: type: LoadBalancer ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 nodePort: 30082 selector: app: sample-app `
spec.ports[x].port 는 LoadBalancer VIP 와 ClusterIP 로 수신하는 Port
spec.ports[x].targetPort 는 전달할 container Port
NodePort 도 자동적으로 할당됨으로 spec.ports[x].nodePort 의 지정도 가능
확인
`$ kubectl get svc sample-lb `
EXTERNAL-IP 가 pending 상태인 경우 LoadBalancer 가 준비되는데 시간이 필요한 경우
Container 내부 통신에는 Cluster IP 를 사용하기에 ClusterIP 도 자동할당
NodePort 도 생성
VIP 는 Kubernetes Node 에 분산되기 때문에 Kubernetes Node 의 scaling 시 변경할 것이 없다
Node 간 통신 배제 (Node 를 건넌 load balancing 배제)
NodePort 와 동일하게 externalTrafficPolicy 를 이용 가능
LoadBalancer VIP 지정
spec.LoadBalancerIP 로 외부의 LoadBalancer IP Address 지정 가능
`# lb_fixip_sample.yml apiVersion: v1 kind: Service metadata: name: sample-lb-fixip spec: type: LoadBalancer loadBalancerIP: xxx.xxx.xxx.xxx ports: - name: "http-port" protocol: "TCP" port: 8080 targetPort: 80 nodePort: 30083 selector: app: sample-app `
미지정 시 자동 할당
GKE 등의 cloud provider 의 경우 주의
GKE 에서는 LoadBalancer service 를 생성하면 GCLB 가 생성된다.
GCP LoadBalancer 등으로 비용이 증가 되지 않도록 주의
IP Address 중복이 허용되거나 deploy flow 상 문제가 없으면 가급적 Service 를 정리할 것
Service 를 만든 상태에서 GKE cluster 를 삭제하면 GCLB 가 과금 되는 상태로 남아버리므로 주의
Headless Service
Pod 의 IP Address 가 반환되는 Service
보통 Pod 의 IP Address 는 자주 변동될 수 있기 때문에 Persistent 한 StatefulSet 한정하여 사용 가능
IP endpoint 를 제공하는 것이 아닌 DNS Round Robin (DNS RR) 을 사용한 endpoint 제공
DNS RR 은 전달할 Pod 의 IP Address 가 cluster 안의 DNS 에서 반환되는 형태로 부하분산이 이루어지기에 client 쪽 cache 에 주의할 필요가 있다.
기본적으로 Kubernetes 는 Pod 의 IP Address 를 의식할 필요가 없도록 되어 있어서 Pod 의 IP Address 를 discovery 하기 위해서는 API 를 사용해야만 한다
Headless Service 를 이용하여 StatefulSet 한정으로 Service 경유로 IP Address 를 discovery 하는 것이 가능하다
create Headless Service
3가지 조건이 충족되어야 한다
Service 의 spec.type 이 ClusterIP
Service 의 metadata.name 이 StatefulSet 의 spec.serviceName 과 같을 것
Service 의 spec.clusterIP 가 None 일것
위 조건들이 충족되지 않으면 그냥 Service 로만 동작하고 Pod 이름을 얻는 등이 불가능하다.
`# headless_sample.yml apiVersion: v1 kind: Service metadata: name: sample-svc spec: type: ClusterIP clusterIP: None ports: - name: "http-port" protocol: "TCP" port: 80 targetPort: 80 selector: app: sample-app `
**Headless Service 를 이용한 Pod 이름 조회**
보통 Service 를 만들면 복수 Pod 에 대응하는 endpoint 가 만들어져 해당 endpoint. 에 대응하여 이름을 조회하는 것이 가능하지만 각각의 Pod 의 이름 조회는 불가능하다
보통 Service 의 이름 조회는 [Service name].[Namespace name].svc.[domain name] 로 조회가 가능하도록 되어 있지만 Headless Service 로 그대로 조회하면 DNS Round Robin 으로 Pod 중의 IP 가 반환되기에 부하 분산에는 적합하지 않다
StatefulSet 의 경우에만, [Pod name].[Service name].[Namespace name].svc.[domain name] 형식으로 Pod 이름 조회가 가능하다
container 의 resolv.conf 등에 search 로 entry 가 들어가 있다면 [Pod name].[Service name] 혹은 [Pod name].[Service name].[Namespace] 등으로 조회 가능
ReplicaSet 등의 Resource 에서도 가능
ExternalName
다른 Service 들과 다르게 Service 이름 조회에 대응하여 CNAME 을 반환하는 Service
주로 다른 이름을 설정하고 싶거나 cluster 안에서 endpoint 를 전환하기 쉽게 하고 싶을 때 사용
create ExternalName service
`# externalname_sample.yml apiVersion: v1 kind: Service metadata: name: sample-externalname namespace: default spec: type: ExternalName externalName: external.example.com `
`$ kubectl get svc `
EXTERNAL-IP 부분에 CNAME 용의 DNS 가 표시된다
container 내부에서 [Service name] 이나 [Service name].[Namespace name].svc.[domain name] 으로 조회하면 CNAME 가 돌아오는 것을 확인 가능
`$ dig sample-externalname.default.svc.cluster.local CNAME `
**Loosely Coupled with External Service**
Cluster 내부에서는 Pod 로의 통신에 Service 이름 조회를 사용하는 것으로 서비스 간의 Loosely Coupled 를 가지는 것이 가능했지만, SaaS 나 IaaS 등의 외부 서비스를 이용하는 경우에도 필요가 있다.
Application 등에서 외부 endpoint를 설정하면 전환할 때 Application 쪽 설정 변경이 필요해지는데 ExternalName을 이용하여 DNS 의 전환은 ExternalName Service 의 변경만으로 가능하여 Kubernetes 상에서 가능해지고 외부와 Kubernetes Cluster 사이의 Loosely Coupled 상태도 유지 가능하다.
외부 서비스와 내부 서비스 간의 전환
ExternalName 이용으로 외부 서비스와의 Loosely Coupled 확보하고 외부 서비스와 Kubernetes 상의 Cluster 내부 서비스의 전환도 유연하게 가능하다
Ingress
L7 LoadBalancer 를 제공하는 Resource
Kubernetes 의 Network Policy resource 에 Ingress/Egress 설정항목과 관련 없음
Ingress 종류
아직 Beta Service 일 가능성
크게 구분하여 2가지
Cluster 외부의 Load Balancer를 이용한 Ingress
Cluster 내부에 Ingress 용의 Pod 을 생성하여 이용하는 Ingress
GKE
Nginx Ingress
Nghttpx Ingress
Cluster 외부의 Load Balancer 를 이용한 Ingress
GKE 같은 Cluster 외부의 Load Balancer를 이용한 Ingress 의 경우, Ingress rosource 를 만드는 것만으로 LoadBalancer 의 VIP 가 만들어져 이용하는 것이 가능
GCP의 GCLB (Google Cloud Load Balancer) 에서 수신한 traffic을 GCLB 에서 HTTPS 종단이나 path base routing 등을 수행하여 NodePort에 traffic을 전송하는 것으로 대상 Pod 에 도달
Cluster 내부에 Ingress 용의 Pod 을 생성하여 이용하는 Ingress
L7 역할을 할 Pod을 Cluster 내부에 생성하는 형태로 실현
Cluster 외부에서 접근 가능하도록 별도 Ingress 용 Pod에 LoadBalancer Service를 작성하는 등의 준비가 필요하다
Ingress 용의 Pod 이 HTTPS 종단이나 path base routing 등의 L7 역할을 하기 위해 Pod 의 replica 수의 auto scale 등도 고려할 필요가 있다.
LB와 일단 Nginx Pod 에 전송하여 Nginx 가 L7 역할을 하여 처리한 후 대상 Pod 에 전송한다.
NodePort 경유하지 않고 직접 Pod IP 로 전송
create Ingress resource
사전 준비가 필요
사전에 만들어진 Service를 Back-end로서 활용하여 전송을 하는 형태
Back-end 로 이용할 Service 는 NodePort 를 지정
`# sample-ingress-apps apiVersion: apps/v1 kind: Deployment metadata: name: sample-ingress-apps spec: replicas: 1 selector: matchLabels: ingress-app: sample template: metadata: labels: ingress-app: sample spec: containers: - name: nginx-container image: zembutsu/docker-sample-nginx:1.0 ports: - containerPort: 80 `
`# ingress service sample apiVersion: v1 kind: Service metadata: name: svc1 spec: type: NodePort ports: - name: "http-port" protocol: "TCP" port: 8888 targetPort: 80 selector: ingress-app: sample `
Ingress 로 HTTPS 를 이용하는 경���에는 인증서는 사전에 Secret 으로 등록해둘 필요가 있다.
Secret 은 인증서의 정보를 바탕으로 YAML 파일을 직접 만들거나 인증서 파일을 지정하여 만든다.
`# 인증서 작성 $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/tls.key -out /tmp/tls.crt -subj "/CN=sample.example.com" # Secret 작성 (인증서 파일을 지정하는 경우) $ kubectl create secret tls tls-sample --key /tmp/tls.key --cert /tmp/tls.crt `
Ingress resource 는 L7 Load Balancer 이기에 특정 host 명에 대해 request path > Service back-end 의 pair 로 전송 rule 을 설정한다.
하나의 IP Address 로 복수의 Host 명을 가지는 것이 가능하다.
spec.rules[].http.paths[].backend.servicePort 에 설정하는 Port 는 Service 의 spec.ports[].port 를 지정
`# ingress_sample.yml apiVersion: extensions/v1beta1 kind: Ingress metadata: name: sample-ingress annotations: ingress.kubernetes.io/rewrite-target: / spec: rules: - host: sample.example.com http: paths: - path: /path1 backend: serviceName: svc1 servicePort: 8888 backend: serviceName: svc1 servicePort: 8888 tls: - hosts: - sample.example.com secretName: tls-sample `
**Ingress resource & Ingress Controller**
Ingress resource = YAML file 에 등록된 API resource
Ingress Controller = Ingress resource 가 Kubernetes 에 등록되었을 때, 어떠한 처리를 수행하는 것
GCP 의 GCLB 를 조작하여 L7 LoadBalancer 설정을 하는 것이나,
Nginx 의 Config 를 변경하여 reload 하는 등
GKE 의 경우
GKE 의 경우, 기본으로 GKE 용 Ingress Controller 가 deploy 되어 있어 딱히 의식할 필요 없이 Ingress resource 마다 자동으로 IP endpoint 가 만들어진다.
Nginx Ingress 의 경우
Nginx Ingress 를 이용하는 경우에는 Nginx Ingress Controller 를 작성해야 한다.
Ingress Controller 자체가 L7 역할을 하는 Pod 이 되기도 하기에 Controller 라는 이름이지만 실제 처리도 수행한다.
GKE 와 같이 cluster 외부에서도 접근을 허용하기 위해서는 Nginx Ingress Controller 으로의 LoadBalancer Service (NodePort 등도 가능) 를 작성할 필요가 있다.
개별적으로 Service 를 만드는 것이기에 kubectl get ingress 등으로 endpoint IP Address 를 확인 불가 하기에 주의가 필요하다.
rule 에 매칭되지 않을 경우의 default 로 전송할 곳을 작성할 필요가 있으니 주의
실제로는 RBAC, resource 제한, health check 간격 등 세세한 설정해 둬야 할 수 있다.
nginx ingress 추천 설정
`# Nginx ingress를 이용하는 YAML sample apiVersion: apps/v1 kind: Deployment metadata: name: default-http-backend labels: app: default-http-backend spec: replicas: 1 selector: matchLabels: app: default-http-backend template: metadata: labels: app: default-http-backend spec: containers: - name: default-http-backend image: gcr.io/google_containers/defaultbackend:1.4 livenessProbe: httpGet: path: /healthz port: 8080 scheme: HTTP ports: - containerPort: 8080 --- apiVersion: v1 kind: Service metadata: name: default-http-backend labels: app: default-http-backend spec: ports: - port: 80 targetPort: 8080 selector: app: default-http-backend --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-ingress-controller spec: replicas: 1 selector: matchLabels: app: ingress-nginx template: metadata: labels: app: ingress-nginx spec: containers: - name: nginx-ingress-controller image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.14.0 args: - /nginx-ingress-controller - --default-backend-service=$(POD_NAMESPACE)/default-http-backend env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace ports: - name: http containerPort: 80 - name: https containerPort: 443 livenessProbe: httpGet: path: /healthz port: 10254 scheme: HTTP readinessProbe: httpGet: path: /healthz port: 10254 scheme: HTTP --- apiVersion: v1 kind: Service metadata: name: ingress-endpoint labels: app: ingress-nginx spec: type: LoadBalancer ports: - port: 80 targetPort: 8080 selector: app: ingress-nginx `
default back-end pod 이나 L7 처리할 Nginx Ingress Controller Pod 의 replica 수가 고정이면 traffic 이 늘었을 때 감당하지 못할 가능성도 있으니 Pod auto scaling 을 수행하는 Horizontal Pod Autoscaler (HPA) 의 이용도 검토해야 할 수 있음
deploy 한 Ingress Controller 는 cluster 상의 모든 Ingress resource 를 봐 버리기에 충돌할 가능성이 있다.
상세 사양
Ingress Class 를 이용하여 처리하는 대상 Ingress resource 를 분리하는 것이 가능
Ingress resource 에 Ingress Class Annotation 을 부여하여 Nginx Ingress Controller 에 대상으로 하는 Ingress Class를 설정하는 것으로 대상 분리 가능
Nginx Ingress Controller 의 기동 시 --ingress-class 옵션 부여
Ingress resource Annotation
/nginx-ingress-controller --ingress-class=system_a ...
kubernetes.io/ingress.class: "system_a"
정리
Kubernetes Service & Ingress
Service
Ingress
L4 Load Balancing
Cluster 내부 DNS 로 lookup
label 을 이용한 Pod Service Discovery
L7 Load Balancing
HTTPS 종단
path base routing
Kubernetes Service
ClusterIP : Kubernetes Cluster 내부 한정으로 통신 가능한 VIP
ExternalIP : 특정 Kubernetes Node 의 IP
NodePort : 모든 Kubernetes Node 의 모든 IP (0.0.0.0)
LoadBalancer : Cluster 외부에 제공되는 Load Balancer의 VIP
ExternalName : CNAME 을 사용한 Loosely Coupled
Headless : Pod 의 IP 를 사용한 DNS Round Robin
Kubernetes Ingress
Cluster 외부의 Load Balancer 를 이용한 Ingress : GKE
Cluster 내부의 Ingress 용 Pod 을 이용한 Ingress : Nginx Ingress, Nghttpx Ingress
Kubernetes Config & Storage resource
container 에 대한 설정 파일, 암호 등의 기밀 정보나 Persistent Volume 등에 관한 resource
3 종류
Secret
ConfigMap
PersistentVolumeClaim
환경변수 이용
Kubernetes 에서는 개별 container에 대한 설정의 내용은 환경변수나 파일이 포함된 영역을 mount 해서 넘기는 것이 일반적이다
환경변수를 넘기려면 pod template 에 env 혹은 envFrom 을 지정
5개의 정보 source로부터 환경변수를 심는 것이 가능
정적설정
Pod 정보
Container 정보
Secret resource 의 기밀 정보
ConfigMap resource 의 Key-Value 값
정적설정
spec.containers[].env 에 정적 값을 정의
`apiVersion: v1 kind: Pod metadata: name: sample-env labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 env: - name: MAX_CONNECTION value: "100" `
Pod 정보
Pod 이 속한 Node 나, Pod 의 IP Address, 기동시간 등의 Pod 에 관련된 정보는 fieldRef를 사용하여 참조할 수 있다.
참조 가능한 값은 kubectl get pods -o yaml 등으로 확인할 수 있다.
등록한 YAML file 의 정보와 별도로 IP, host 정도 등도 추가되었다.
`# 동작 중인 Pod 의 정보 확인 $ kubectl get pod nginx-pod -o yaml ... spec: nodeName: gke-k8s-... ... `
`# env-pod-sample.yml # set Kubernetes Node's name to env K8S_NODE apiVersion: v1 kind: Pod metadata: name: sample-env-pod labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 env: - name: K8S_NODE valueFrom: fieldRef: fieldPath: spec.nodeName `
**Container 정보**
Container 에 관련한 정보는 resourceFieldRef를 사용하여 참조할 수 있다.
Pod 에는 복수 container 의 정보가 포함되어 있어서 각 container에 설정 가능한 값에 대해서는 fieldRef로는 참조할 수 없는 것에 주의.
참조 가능한 값에 관해선 kubectl get pods -o yaml 등오로 확인 가능
`# env-container-sample.yml #set CPU Requests/Limits to ENV apiVersion: v1 kind: Pod metadata: name: sample-env-container labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 env: - name: CPU_REQUEST valueFrom: resourceFieldRef: containerName: nginx-container resource: requests.cpu - name: CPU_LIMIT valueFrom: resourceFieldRef: containerName: nginx-container resource: limits.cpu `
**Secret resource 기밀 정보**
기밀 정보는 별도 Secret resource 를 만들어 환경변수로 참조 시키는 것을 추천
ConfigMap resource 에서 Key-Value 값
단순한 Key-Value 값이나 설정 파일 등은 ConfigMap 으로 관리하는 것이 가능
일괄 변경이나 중복이 많이 많은 경우 ConfigMap 을 사용하는 것이 유용할 수 있음
환경변수 이용 시 주의점
`# env fail sample apiVersion: v1 kind: Pod metadata: name: sample-fail-env labels: app: sample-app spec: containers: - name: nginx-container image: nginx:1.12 command: ["echo"] args: ["${TESTENV}", "${HOSTNAME}"] env: - name: TESTENV value: "100" `
command 나 args 에 환경변수를 이용하기 위해선 ${} 가 아닌 $()를 사용해야 한다
command 나 args 에서 참조가능한 환경 변수는 해당 Pod template 내부에서 정의된 환경 변수에 제한된다. (위 예제에서는 TESTENV 값만 참조 가능)
OS 등에서만 참조할 수 있는 환경 변수를 이용하기 위해서는 script 등을 이용하여 실행하도록 해야 한다.
Secret
DataBase 등을 사용할 때 필요한 유저, 암호 등의 인증에 필요한 경우 이용할 수 있는 방식
유저명과 암호를 별도 resource 로 정의해두고 Pod 에서 이를 불러들여 사용하기 위한 resource
Secret이 정의된 YAML Manifest를 암호화하는 ::kubesec:: 이란 OSS 존재
gpg, Google Cloud KMS, AWS KMS 등을 이용하여 간단하게 data.* 부분만 암호화할 수 있어 외부로 공개되어도 문제 소지가 적다.
Docker build 시 Container Image 에 첨부
환경변수나 실행 인수 등에 포함시켜 Container Image 를 build.
기밀 정보를 포함하고 있는 만큼 해당 Image 를 외부에 공개하거나 배포하기 곤란하며 인증 정보가 변경된다면 Image 를 다시 build 해야 하는 등 불편
Pod 이나 Deployment YAML Manifest 에 첨부
이 역시 YAML 이 외부에 알려져서는 안되고 복수 Application 에서 동일한 정보를 사용하게 된다면 여기저기 퍼지게 되는 것이라 이 역시 문제
Secret 분류
Generic (type: Opaque)
TLS (type: kubernetes.io/tls)
Docker Registry (type: kubernetes.io/dockerconfigjson)
Service Account (type: kubernetes.io/service-account-token)
Generic (type: Opaque)
보통 사용하는 암호 등에 이용
작성 방법
file 을 이용 (--from-file)
yaml
kubectl 이용하어 직접 생성 (--from-literal)
envfile
Secret 에서는 복수의 Key-Value 값이 보존된다.
db-auth 라는 이름의 Secret 에는 username, password 라는 Key가 있고 그에 해당하는 Value 값이 존재할 수 있다.
복수의 DB를 사용하는 경우, Secret 이름이 겹치지 않게 정의하거나 시스템별로 Namespace를 분할하는 등의 작업이 필요
from File
--from-file 옵션을 사용하여 파일을 지정하여 사용한다
파일명이 곧 Key 가 되기에 파일명에 포함된 확장자 등은 제거하는게 좋을 수 있다.
확장자를 제거하기 싫다면, --from-file=username=username.txt 처럼 사용할 수도 있다.
파일에 개행문��(\n)가 들어가지 않도록 echo -n 을 사용하는 등으로 주의해야 한다.
`# source 파일 생성 $ echo -n "user" > ./username $ echo -n "password" > ./password # Secret 생성 $ kubectl create secret generic sample-db-auth --from-file=./username --from-file=./password # json 형식으로 base64 형식으로 encode 된 Secret data 부분을 확인 $ kubectl get secret sample-db-auth -o json | jq .data # base64 형식으로 encode 된 내용을 평문으로 확인 $ kubectl get secret sample-db-auth -o json | jq -r .data.username | base64 -d `
from yaml
`apiVersion: v1 kind: Secret metadata: name: sample-db-auth type: Opaque data: username: dXNlcgo= password: cGFzc3dvcmQK `
using kubectl (--from-literal)
—from-literal 옵션 사용
`$ kubectl create secret generic sample-db-auth --from-literal=username=user --from-literal=password=password `
from envfile
일괄적으로 처리할 때 이용가능하며 Docker 에서 —env-file 옵션을 사용하여 container를 사용하고 있었다면 그대로 Secret 에 이식하는 것도 가능하다.
`username=user password=password $ kubectl create secret generic sample-db-auth --from-env-file ./env_secret `
TLS (type: kubernetes.io/tls)
Ingress 등에서 참조 가능한 TLS 용 Secret
주로 인증서 등을 이용하기 위해 사용하며 일반적으로 파일을 이용하여 이용한다.
--key 와 —cert 로 비밀키와 인증서를 지정한다.
`# create certificate $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/tls.key -out /tmp/tls.crt - subj "/CN=sample1.example.com" # create TLS Secret $ kubectl create secret tls tls-sample --key /tmp/tls.key --cert /tmp/tls.crt `
Docker Registry (type: kubernetes.io/dockerconfigjson)
Docker Registry 인증 정보용
using kubectl
kubectl 을 이용할 경우 Registry 서버와 인증정보를 인수로 지정한다.
`$ kubectl create secret docker-registry sample-registry-auth \ --docker-server=REGISTRY_SERVER \ --docker-username=REGISTRY_USER \ --docker-password=REGISTRY_USER_PASSWORD \ --docker-email=REGISTRY_USER_EMAIL # get $ kubectl get secret -o json sample-registry-auth | jq .data `
Secret 을 이용한 image 획득
인증이 필요한 Docker Registry 나 Docker Hub 의 Private repository 의 Image 를 가져오기 위해서 Secret 을 사전에 만들어놓고 Pod 의 spec.imagePullSecrets 에 docker-registry 타입의 Secret 을 지정한다.
`# secret-pull_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-pod spec: containers: - name: secret-image-container image: REGISTRY_NAME/secret-image:latest imagePullSecrets: - name: sample-registry-auth `
Service Account
수동으로 만드는 것은 아니지만 Pod 에 Service Account 의 Token 을 mount 하기 위함
Secret 이용
Secret 을 container에서 이용할 경우, 크게 나눠서 2가지 패턴
환경변수
Volume으로 mount
Secret 의 특정 Key 한정
Secret 의 모든 Key
Secret의 특정 Key 한정
Secret 의 모든 키
환경변수
환경변수로 넘길 경우, 특정 Key만을 넘기 던가, Secret 전체를 넘기던가.
특정 key 만을 넘길 경우, spec.containers[].env 의 valueFrom.secretKeyRef 를 사용하여 넘길 키를 지정
`# secret_single_env_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-secret-single-env spec: containers: - name: secret-container image: nginx:1.12 env: - name: DB_USERNAME valueFrom: secretKeyRef: name: sample-db-auth key: username `
env 로 1개씩 정의가 가능하여 환경변수 명을 지정 가능하다.
Secret 전체를 넘길 경우
`# secret_multi_env_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-secret-multi-env spec: containers: - name: secret-container image: nginx:1.12 envFrom: - secretRef: name: sample-db-auth `
Key 를 일일이 지정하지 않아도 되어 간결하지만 Secret 에 저장되어 있는 값을 Pod Template 로는 파악하기 힘들다. **Volume Mount**
특정 Key 만을 넘길 경우, spec.volumes[] 의 secret.item[] 을 사용하여 지정한다.
`# secret_single_volume_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-secret-single-volume spec: containers: - name: secret-container image: nginx:1.12 volumeMounts: - name: config-volume mountPath: /config volumes: - name: config-volume secret: secretName: sample-db-auth items: - key: username path: username.txt `
mount 할 파일을 1개씩 정의할 수 있어 파일명을 지정 가능하다. <pre>`$ kubectl exec -it sample-secret-single-volume cat /config/username.txt `</pre>
Secret 전체를 변수로 넘길 경우
`# secret_multi_volume_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-secret-multi-volume spec: containers: - name: secret-container image: nginx:1.12 volumeMounts: - name: config-volume mountPath: /config volumes: - name: config-volume secret: secretName: sample-db-auth `
`$ kubectl exec -it sample-secret-multi-volume ls /config `
**동적 Secret 갱신** Volume Mount 으로 Secret 을 이용할 경우 일정 기간 주기 (kubelet 의 Sync Loop 의 타이밍) 로 kube-apiserver 에 변경을 확인해서 변경이 있을 경우 갱신한다. 기본적으로 SyncLoop의 간격은 60초로 설정되어 있으나 kuebelet 의 옵션 `—sync-frequency` 로 변경 가능하다. (이는 환경변수를 이요하는 경우에는 이용 불가 하다.) 이 경우 Volume 에 마운트 된 파일의 값이 바뀌어도 Pod 이 재생성 되는 것이 아니어서 끊기는 것을 걱정할 필요도 없다. Secret 에서 삭제된 것 역시 같이 삭제된다. ## ConfigMap ConfigMap 은 설정 정보 등을 Key-Value 로 보존할 수 있는 데이터를 저장하기 위한 resource. Key-Value 라고 해도 nginx.conf 나 httpd.conf 와 같은 설정 파일 그 자체도 보존가능하다. **create ConfigMap** Generic type Secret 과 거의 동일한 방법.
3가지 방법
파일을 사용 (—from-file)
yaml 파일을 사용
kubectl로 직접 생성 (—from-literal)
ConfigMap 에는 복수의 Key-Value 값이 들어간다. nginx.conf 전체를 ConfigMap 안에 넣거나 nginx.conf 의 설정 parameter만 넣어도 된다.
파일을 사용
파일을 사용할 경우. —from-file 을 지정한다. 보통 파일명이 그대로 Key 로 사용되며 Key명을 변경하고 싶을 경우, —from-file=nginx.conf=sample-nginx.conf 와 같은 형식으로 지정한다.
`# create ConfigMap $ kubectl create configmap sample-configmap --from-file=./nginx.conf # 확인 $ kubectl get configmap sample-configmap -o json | jq .data # describe $ kubectl describe configmap sample-configmap `
YAML 파일을 사용
Value 값이 길 경우, Key: | 처럼 정의
`apiVersion: v1 kind: ConfigMap metadata: name: sample-configmap data: thread: "16" connection.max: "100" connection.min: "10" sample.properties: | property.1=value-1 property.2=value-2 property.3=value-3 nginx.conf: | user nginx; worker_processes auto; error_log /var/log/nginx/error.log; pid /run/nginx.pid; ... `
kubectl 을 사용 (—from-literal)
`$ kubectl create configmap web-config \ --from-literal=connection.max=100 \ --from-literal=connection.min=10 `
ConfigMap 이용
환경변수
Volume mount
특정 Key
모든 Key
특정 Key
모든 Key
환경변수
특정 Key만 넘길 경우, spec.containers[].env 의 valueFrom.configMapKeyRef 를 사용
`# configmap_single_env_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-configmap-single-env spec: containers: - name: configmap-container image: nginx:1.12 env: - name: CONNECTION_MAX valueFrom: configMapKeyRef: name: sample-configmap key: connection.max `
env 한개씩 정의가 가능해서 환경변수명을 지정 가능
모든 Key를 넘길 경우, 한개씩 정의할 필요가 없어 YAML 이 간략해지지만 YAML 만으로는 어떤 값이 있는지 판단하긴 힘들다
`# configmap_multi_env_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-configmap-multi-env spec: containers: - name: configmap-container imgae: nginx:1.12 envFrom: - configMapRef: name: sample-configmap `
환경변수를 이용하는 경우, 다음과 같은 패턴은 표현할 수 없어 전달되지 않는다.
. 이 포함되는 경우
개행이 포함되는 경우. (Key. | 로 정의된 경우)
Volume Mount
특정 Key 만 넘길 경우, spec.volumes[] 의 configMap.items[] 를 사용한다.
`# configmap_single_volume_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-configmap-single-volume spec: containers: - name: configmap-container image: nginx:1.12 volumeMounts: - name: config-volume mountPath: /config volumes: - name: config-volume configMap: name: sample-configmap items: - key: nginx.conf path: nginx-sample.conf `
mount 할 파일을 하나씩 정의하므로 파일명을 지정 가능하다. <pre>`$ kubectl exec -it sample-configmap-single-volume cat /config/nginx-sample.conf `</pre>
모든 Key 를 넘길 경우, YAML 이 간결해지지만 YAML 만으로는 어떤 값이 있는지 판단하기 힘들다.
`# configmap_multi_volume_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-configmap-multi-volume spec: containers: - name: configmap-container image: nginx:1.12 volumeMounts: - name: config-volume mountPath: /config volumes: - name: config-volume configMap: name: sample-configmap `
`$ kubectl exec -it sample-configmap-multi-volume ls /config `
**동적 ConfigMap 갱신** volume mount 를 이용하는 경우, 일정시간 주기 (kubelet 의 Sync Loop 타이밍)로 kube-apiserver 에 변경을 확인해서, 변경이 있으면 갱신한다. SyncLoop 의 기본값은 60초로 설정되어 있으나 이를 변경하고 싶을 경우, kubelet 의 `—sync-frequency` 를 사용하여 설정한다. (환경 변수의 경우 동적 갱신이 불가) ## Volume 과 PersistentVolume, PersistentVolumeClaim 의 차이 Volume 은 기존의 Volume (host 영역, NFS, Ceph, GCP Volume) 등을 YAML Manifest 에 직접 지정하여 이용 가능하게 하는 것이다. 그래서 이용자가 신규로 Volume 을 작성하거나, 기존의 Volume 을 삭제하는 등의 조작이 불가능하다. 그리고 YAML Manifest 로 Volume resource 를 만드는 등의 처리도 불가능하다. PersistentVolume은 외부의 Persistent 한 Volume을 제공하는 시스템과 연계하여 신규 Volume 을 만들거나 기존의 Volume 을 삭제하는 것이 가능하다. 구체적으로는 YAML Manifest 등을 통해 Persistent Volume resource 를 별도 만드는 형식이다. PersistentVolume 의 plug-in 에서는 Volume 의 생성과 삭제와 같은 life cycle 을 처리하는 것이 가능하지만 Volume 의 plug-in 의 경우, 이미 있는 Volume 을 사용하는 것만 가능하다. PersistentVolumeClaim 은 PersistentVolume resource 에서 assign 하기 위한 resource. PersistentVolume은 cluster에 volume 을 등록하기만 하는거라, 실제로 Pod 에서 이용하기 위해서는 PersistentVolumeClaim 을 정의해야 한다. Dynamic Provisioning 기능을 이용할 경우, PersistentVolumeClaim을 이용하는 시점에 Persistent Volume 을 동적으로 생성하는 것이 가능해서 순서가 반대가 될 수 있다. ## Volume [Volume Plug-in](https://kubernetes.io/docs/concepts/storage/volumes/) PersistentVolume과 달리 Pod 에 대해 정적으로 영역을 지정하기 위한 형태가 되기 때문에 충돌(경쟁)에 주의 **EmptyDir**
Pod 의 일시적인 디스크 영역으로 이용 가능
Pod 이 Terminate 되면 삭제됨
`# emptydir-sample.yml apiVersion: v1 kind: Pod metadata: name: sample-emptydir spec: containers: - image: nginx:1.12 name: nginx-container volumeMounts: - mountPath: /cache name: cache-volume volumes: - name: cache-volume emptyDir: {} `
**HostPath**
Kubernetes Node 상의 영역을 container 에 mapping 하기 위한 plug-in.
type
Directory
DirectoryOrCreate
File
Socket
BlockDevice
`# hostpath-sample.yml apiVersion: v1 kind: Pod metadata: name: sample-hostpath spec: containers: - image: nginx:1.12 name: nginx-container volumeMounts: - mountPath: /srv name: hostpath-sample volumes: - name: hostpath-sample hostPath: path: /data type: DirectoryOrCreate `
## PersistentVolume (PV)
Volume 이 Pod 정의에 포함되었다면 PersistentVolume은 resource 로 별도로 작성
엄밀히 말하면 Config&Storage resource 보다 Cluster resource.
PersistentVolume 종류
기본적으로 network 를 이용해 disk attach 한다.
Persistent Volumes - Kubernetes
GCE Persistent Disk
AWS Elastic Block Store
NFS
iSCSI
Ceph
OpenStack Cinder
GlusterFS
create PersistentVolume
label
용량
access mode
reclaim policy
mount option
storage class
setting for each PersistentVolume
`# pv_sample.yml apiVersion: v1 kind: PersistentVolume metadata: name: sample-pv labels: type: nfs envrionment: stg spec: capacity: storage: 10G accessModes: - ReadWriteMany persistentVolumeReclaimPolicy: Retain storageClassName: slow mountOptions: - hard nfs: server: xxx.xxx.xxx.xxx path: /nfs/sample `
`$ kubectl get pv `
**Label** Dynamic Provisioining 을 사용하지 않고 PersistentVolume을 사용하는 경우, 종류가 알 수 없게 되기 쉬우니 type, environment, speed 등의 label 을 붙이는걸 추천한다. Label 을 붙이면 PersistentVolumeClaim 에서 Volume 의 Label 을 지정가능하여 scheduling 을 유난하게 수행할 수 있다. **용량** Dynamic Provisioning 을 이용할 수 없을 경우, 무작정 용량을 크게 잡아선 안 된다. 요구한 용량과 가장 가까운 용량을 가진 storage 가 사용되기 때문. **access mode**
ReadWriteOnce (RWO) : 단일 node 에서 Read / Write
ReadOnlyMany (ROX) : 단일 node 에서 Write, 복수 노드에서 Read
ReadWriteMany (RWX) : 복수 node 에서 Read / Write
Persistent Volumes - Kubernetes
Reclaim Policy
Persistent Volume 사용이 끝난 후 처리방법
Retain
Recycle
Delete
data 를 삭제하지 않고 보존
다른 PersistentVolumeClaim 으로 재 mount 되는 일은 없다
data 삭제 (rm -rf ./*), 재이용가능한 상태로
다른 PersistentVolumeClaim 에서 재 mount 가능
PersistentVolume 삭제
주로 외부 volume 을 사용할 때 사용
Mount Options
PersistentVolume 의 종류에 따라 상이. 확인.
Storage Class
Dynamic Provisioning 의 경우 사용자가 PersistentVolumeClaim 을 사용하여 PersistentVolume 을 요구할 때 원하는 디스크를 지정하기 위해 사용.
Storage Class 선택 = 외부 Volume 종류 선택
`#storageclass_sample.yml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: sample-storageclass parameters: availability: test-zone-la type: scaleio provisioner: kubernetes.io/cinder `
PersistentVolume plug-in 별 설정
실제로는 종류에 따라 제각가이라 확인 필요
PersistentVolumeClaim (PVC)
PersistemtVolumeClaim 에 지정된 조건을 가지고 요구하여 Scheduler 는 현재 보유하고 있는 PersistentVolume 에서 적합한 Volume 을 할당
PersistentVolumeClaim 설정
label selector
capacity
access mode
Storage Class
PVC 에서 요구하는 용량이 PV 용량보다 작으면 할당된다. 8기가를 요구 했는데 딱 맞는게 없고 그 보다 큰 20 기가가 있으면 20기가 를 할당.
NFS 의 경우 Quota 가 걸려 있지 않아 PV 의 용량이 사실상 무시된다.
create PersistentVolumeClaim
`# pvc_sample.yml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: sample-pvc spec: selector: matchLabels: type: "nfs" matchExpressions: - {key: environment, operator: In, values: [stg]} accessModes: - ReadWriteOnce resources: requests: storage: 4Gi `
`$ kubectl get pvc $ kubectl get pv `
PVC 가 PV 확보에 실패하면 pending 상태가 유지된다.
Retain Policy 를 사용하고 있을 경우, Pod 이 종료되면 Bound 상태에서 Released 상태로 바뀐다. 이 Released 상태가 된 PV 는 PVC 가 재할당하지 않는다.
use in Pod
spec.volumes 의 persistentVolumeClaim.claimName 을 사용
`# pvc_pod_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-pvc-pod spec: containers: - name: nginx-container image: nginx ports: - containerPort: 80 name: "http" volumeMounts: - mountPath: "/usr/share/nginx/html" name: nginx-pvc volumes: - name: nginx-pvc persistentVolumeClaim: className: sample-pvc `
Dynamic Provisioning
동적으로 PV 를 만들기 때문에 용량을 효율적으로 사용하는 것이 가능하고 사전에 PV 를 만들어 둘 필요가 없다.
많은 Provisioner 가 ReadWriteOnce 밖에 지원하지 않는 경우가 많다.
`# Storage Class # storageclass_sample.yml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: sample-storageclass parameters: type: pd-standard provisioner: kubernetes.io/gce-pd reclaimPolicy: Delete `
`# PVC # pvc_provisioner_sample.yml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: sample-pvc-provisioner annotations: volume.beta.kubernetes.io/storage-class: sample-storageclass spec: accessModes: - ReadWriteOnce resources: requests: storage: 3Gi `
`# Pod # pvc_provisioner_pod_sample.yml apiVersion: v1 kind: Pod metadata: name: sample-pvc-provisioner-pod spec: containers: - name: nginx-container image: nginx prots: - containerPort: 80 name: "http" volumeMounts: - mountPath: "/usr/share/nginx/html" name: nginx-pvc volumes: - name: nginx-pvc persistentVolumeClaim: claimName: sample-pvc-provisioner `
`$ kubectl get pv `
PersistentVolumeClaim in StatefulSet
StatefulSet 에서는 PersistentVolumeClaim 을 이용할 경우가 많고spec.volumeClaimTemplate 을 이용하면 별도로 PVC 를 정의할 필요 없다.
`apiVersion: apps/v1beta1 kind: StatefulSet metadata: ... spec: template: spec: containers: - name: sample-pvct image: nginx:1.12 volumeMounts: - name: pvc-template-volume mountPath: /tmp volumeClaimTemplates: - matadata: name: pvc-template-volume spec: accessModes: - ReadWriteOnce resources: requests: storage: 10Gi storageClassName: "sample-storageclass"
0 notes
Link
CCA 131 - Cloudera Certified Hadoop and Spark Administrator ##FreeCourse ##UdemyFrancais #Administrator #CCA #CERTIFIED #Cloudera #Hadoop #Spark CCA 131 - Cloudera Certified Hadoop and Spark Administrator CCA 131 is certification exam conducted by the leading Big Data Vendor, Cloudera. This online proctored exam is scenario based which means it is very hands on. You will be provided with multi-node cluster and need to take care of given tasks. To prepare the certification one need to have hands on exposure in building and managing the clusters. However, with limited infrastructure it is difficult to practice in a laptop. We understand that problem and built the course using Google Cloud Platform where you can get credit up to $300 till offer last and use it to get hands on exposure in building and managing Big Data Clusters using CDH. Required Skills Install - Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects. Set up a local CDH repository Perform OS-level configuration for Hadoop installation Install Cloudera Manager server and agents Install CDH using Cloudera Manager Add a new node to an existing cluster Add a service using Cloudera Manager Configure - Perform basic and advanced configuration needed to effectively administer a Hadoop cluster Configure a service using Cloudera Manager Create an HDFS user's home directory Configure NameNode HA Configure ResourceManager HA Configure proxy for Hiveserver2/Impala Manage - Maintain and modify the cluster to support day-to-day operations in the enterprise Rebalance the cluster Set up alerting for excessive disk fill Define and install a rack topology script Install new type of I/O compression library in cluster Revise YARN resource assignment based on user feedback Commission/decommission a node Secure - Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices Configure HDFS ACLs Install and configure Sentry Configure Hue user authorization and authentication Enable/configure log and query redaction Create encrypted zones in HDFS Test - Benchmark the cluster operational metrics, test system configuration for operation and efficiency Execute file system commands via HTTPFS Efficiently copy data within a cluster/between clusters Create/restore a snapshot of an HDFS directory Get/set ACLs for a file or directory structure Benchmark the cluster (I/O, CPU, network) Troubleshoot - Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios Resolve errors/warnings in Cloudera Manager Resolve performance problems/errors in cluster operation Determine reason for application failure Configure the Fair Scheduler to resolve application delays Our Approach You will start with creating Cloudera QuickStart VM (in case you have laptop with 16 GB RAM with Quad Core). This will facilitate you to get comfortable with Cloudera Manager. You will be able to sign up for GCP and avail credit up to $300 while offer lasts. Credits are valid up to year. You will then understand brief overview about GCP and provision 7 to 8 Virtual Machines using templates. You will also attaching external hard drive to configure for HDFS later. Once servers are provisioned, you will go ahead and set up Ansible for Server Automation. You will take care of local repository for Cloudera Manager and Cloudera Distribution of Hadoop using Packages. You will then setup Cloudera Manager with custom database and then Cloudera Distribution of Hadoop using Wizard that comes as part of Cloudera Manager. As part of setting up of Cloudera Distribution of Hadoop you will setup HDFS, learn HDFS Commands, Setup YARN, Configure HDFS and YARN High Availability, Understand about Schedulers, Setup Spark, Transition to Parcels, Setup Hive and Impala, Setup HBase and Kafka etc. Once all the services are configured, we will revise for exam by mapping with required skills of the exam. Who this course is for: System Administrators who want to understand Big Data eco system and setup clusters Experienced Big Data Administrators who want to prepare for the certification exam Entry level professionals who want to learn basics and Setup Big Data Clusters 👉 Activate Udemy Coupon 👈 Free Tutorials Udemy Review Real Discount Udemy Free Courses Udemy Coupon Udemy Francais Coupon Udemy gratuit Coursera and Edx ELearningFree Course Free Online Training Udemy Udemy Free Coupons Udemy Free Discount Coupons Udemy Online Course Udemy Online Training 100% FREE Udemy Discount Coupons https://www.couponudemy.com/blog/cca-131-cloudera-certified-hadoop-and-spark-administrator/
0 notes
ncmagroup · 4 years
Text
By Isaac Kohen
  Due to COVID-19, companies have found themselves in the middle of the world’s largest work-from-home experiment. Many hail remote work as a blessing, allowing employees to continue working while practicing social distancing during this uncertain time
Of course, even before coronavirus grabbed our attention, many companies were already adapting various telecommuting and remote work policies. A 2019 study by Owl Labs found that 16% of global companies were fully remote, and 40% were hybrid (companies who offer both remote and in-office options). There are good reasons why companies have been embracing the remote work culture. From boosting productivity, reducing the cost of operations, and enabling access to a larger talent pool, remote work has many benefits.
However, for all its advantages, remote work also presents many unique challenges. Companies must establish communication norms, address diminished productivity due to lack of supervision, and create systems for managing accountability, tracking projects, executing payroll, and prioritizing cybersecurity. Taken together, these hurdles pose a significant hindrance to fruitful remote work. Fortunately, with the right strategy and tools, you can effectively identify and address these issues to fully reap the benefits of a remote team.
For those enacting a rapid transition to digital work or even for veteran leaders of decentralized teams, here are five best practices to ensure that your operation runs smoothly.
#1 Adopt Remote Security Measures
Ensuring security, privacy, and data protection in a fully remote setup will be difficult. Traditional intrusion prevention systems, firewalls, and sandboxes are not enough now that employees are no longer under your corporate security perimeter. While VPNs are being advertised as a solution, they are not the quick fix that many companies expect them to be.
One workaround is an endpoint-based data loss prevention (eDLP) solution. It can monitor and protect your employees’ workstations instead of just the servers and nodes on your network. This has the benefit of preventing data leaks by mitigating risk at the user level, where the majority of cybersecurity threats occur.
Another approach that companies can take is to utilize virtual machines on Windows, VMware Horizon, AWS/Azure/GCP, or set up a Terminal Server for their remote team. With this arrangement, employees can then login to the server using RDP while continuing to protect the company’s sensitive network and repositories. This is particularly useful if you have third-party contractors or other external users you need to support.
#2 Implement a Communication Policy
Frequent and fluid communications are keys to a successful remote team. Zoom, Slack, Microsoft Teams, Google Hangout, Telegram, 8×8, WebEx, and Jabber all are useful tools. Many of them are offering free services for a limited time. In many cases, cell phone providers are also extending or removing the data cap.
However, due to increased demand, you may experience some limitations and disruptions. For example, Zoom recently removed the dial-in by phone audio conferencing capabilities from its Basic plan. Most recently, Microsoft Teams went down for over two hours because it was overloaded with activity. It’s a good idea to have a backup plan in case your default service fails.
Regardless of your brand preference, select one that has end-to-end encryption. If you want better protection, you can adapt a Data Loss Prevention (DLP), Content Monitoring and Filtering (CMF), or Extrusion Prevention System (EPS) solution. Many of these have real-time monitoring for email and IM/chat conversations to prevent sensitive data exfiltration over your communication channels.
#3 Utilize Team Collaboration Tools
Office 365, Google Apps, and other collaboration suites have built-in tools that allow employees to work together seamlessly. You can also use cloud drives to enable document sharing. When doing so, don’t forget to implement access control policies so that sensitive documents don’t get into the wrong hands.
Use project management tools for assigning team responsibilities, defining project priorities and tasks, and providing clear directions. This is critical for remote team collaborations. The right tools provide employee clock-ins, create and track projects and tasks in real-time, integrate with your support systems/PM systems, and many other useful features to improve team collaboration and engagement.
#4 Monitor Employee Engagement and Productivity
Remote workers can be highly productive. However, flexible and “always-on” hours can lead to a burnout culture that neither supports employee well being nor promotes company priorities. Therefore, establishing a consistent workflow that measures team performance is critical to ensuring that your team stays productive and doesn’t suffer alienation, burnout, or app-overload.
Specifically, a user and entity behavior analytics (UEBA) software can be utilized to track employee behavior for signs of stress or other abnormalities. Of course, companies should implement this software carefully because many employee monitoring and time tracking software solutions allow you to record screens. This can be demotivating and hamper productivity. Instead, focus on activity and outcomes to support both employees and employers.
#5 Manage Your Remote Team with (Special) Care
The truth is, many managers are not trained to manage remote employees. After all, traditional management practices focus on personalized interactions, body language, and other soft skills that do not translate well in a virtual world.
Remote employees also have unique needs that require special attention. For example, a survey conducted by Buffer found that remote workers struggle with unplugging from work, loneliness, and collaboration/communications. Make sure your managers are mindful of these social elements. It seems counterintuitive, but technology can help address many of these “human” challenges. Conducting daily one-to-ones, using video conferencing, organizing “shenanigans,” “watercooler,” or casual team time, ideas generation sessions (for example, using Mural for remote brainstorming and engagement), flexible scheduling – all are useful tips. The key is to invest in the extra effort and pay attention to your employees.
While painful, this could be the watershed moment for the work-from-home movement. By the time we reach the end of the COVID-19 pandemic, remote work will likely be even more normative than before. Companies should prepare now with a full understanding of the remote workflow, security, productivity, and monitoring tools so that they can manage decentralized teams at scale – for the future of remote work is already here.
  Go to our website:   www.ncmalliance.com
5 Remote Working Best Practices and Tips in the Era of the Coronavirus Pandemic By Isaac Kohen By Isaac Kohen Due to COVID-19, companies have found themselves in the middle of the world’s largest work-from-home experiment…
0 notes