#onnx
govindhtech · 3 months
Text
How ONNX Runtime is Evolving AI in Microsoft with Intel
With the goal of bringing AI features to devices, the Microsoft Office team has been working with Intel and ONNX Runtime for over five years to integrate AI capabilities into their array of productivity products. The extension of AI inference deployment from servers to Windows PCs enhances responsiveness, preserves data locally to protect privacy, and increases the versatility of AI tooling by removing the requirement for an internet connection. These advancements keep powering Office features like neural grammar checker, ink form identification, and text prediction.
What is ONNX Runtime
As a result of their extensive involvement and more than two decades of cooperation, Intel and Microsoft are working more quickly to integrate AI features into Microsoft Office for Windows platforms. ONNX Runtime, which enables machine learning models to scale across various hardware configurations and operating systems, is partially responsible for this accomplishment. ONNX Runtime is continuously refined by Microsoft, Intel, and the open-source community, and those refinements directly improve the efficiency of the Microsoft Office AI models running on Intel platforms.
Generative AI
With ONNX Runtime, you can incorporate the power of large language models (LLMs) and generative artificial intelligence (AI) into your apps and services. State-of-the-art models for image synthesis, text generation, and other tasks can be used regardless of the language you develop in or the platform you need to run on.
ONNX Runtime Web
With a standard implementation, ONNX Runtime Web enables cross-platform portability for JavaScript developers to execute and apply machine learning models in browsers. Due to the elimination of the need to install extra libraries and drivers, this can streamline the distribution process.
ONNX Runtime Mobile
Using the same API as cloud-based inferencing, ONNX Runtime Mobile runs models on mobile devices. Swift, Objective-C, Java, Kotlin, JavaScript, C, and C++ developers can integrate AI into Android, iOS, React Native, and MAUI/Xamarin applications using their preferred mobile language and development environment.
ONNX Runtime Optimization
ONNX Runtime can efficiently run inference for models from various source frameworks (PyTorch, Hugging Face, TensorFlow) on many hardware and software stacks. In addition to supporting APIs in many languages (including Python, C++, C#, C, Java, and more), ONNX Runtime Inference leverages hardware accelerators and works in web browsers, on cloud servers, and on edge and mobile devices.
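As a concrete illustration, here is a minimal Python sketch of ONNX Runtime inference; the model path, input name, and tensor shape are placeholders for whatever model you actually deploy.

```python
import numpy as np
import onnxruntime as ort

# Load a model and pick an execution provider (CPU here; GPU/NPU providers
# can be listed instead where available).
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Run inference on an example input tensor; None means "return all outputs".
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)
```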
Ensuring optimal on-device AI user experience necessitates ongoing hardware and software optimization, coordinated by seasoned AI-versed experts. The most recent ONNX Runtime capabilities are regularly added to Microsoft Office’s AI engine, guaranteeing optimal performance and seamless AI model execution on client devices.
Intel and Microsoft Office have used quantization, an accuracy-preserving technique for optimizing individual AI models to employ smaller datatypes. “Microsoft Office’s partnership with Intel on numerous inference projects has achieved notable reductions in memory consumption, enhanced performance, and increased parallelization all while maintaining accuracy by continuing to focus on our customers,” stated Joshua Burkholder, Principal Software Engineer of Microsoft’s Office AI Platform.
With the help of Intel’s DL Boost, a collection of specialized hardware instruction sets, this method reduces the on-device memory footprint, which in turn reduces latency. The ONNX Runtime has been tuned to work with Intel’s hybrid CPU design, which combines efficiency and performance cores. With Intel Thread Director, this is further enhanced by utilising machine learning to schedule activities on the appropriate core, guaranteeing that they cooperate to maximise performance-per-watt.
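To make the quantization idea concrete, here is a small sketch of post-training dynamic quantization using ONNX Runtime's stock tooling - this shows the general technique, not Office's internal pipeline, and the file names are placeholders.

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Convert FP32 weights to INT8 at rest; activations are quantized dynamically
# at run time, so no calibration dataset is required.
quantize_dynamic(
    model_input="model_fp32.onnx",
    model_output="model_int8.onnx",
    weight_type=QuantType.QInt8,
)
```

Weight-only schemes like this typically shrink the model's on-disk and in-memory size by roughly 4x, which is where the footprint and latency gains described above come from.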
Furthermore, on-device AI support for Office web-based experiences is being provided by Intel and Microsoft in partnership. The ONNX Runtime Web makes this feasible by enabling AI feature support directly in web applications, like Microsoft Designer.
Balancing Cloud and On-device
With the advent of AI PCs, particularly those featuring the latest Intel Core Ultra processor, more workloads can move from cloud-based systems to client devices. Combining a CPU, GPU, and NPU, Intel Core Ultra processors offer complementary AI compute capabilities that, together with model and software optimizations, can be leveraged to provide an optimal user experience.
While the AI PC opens up new possibilities for executing AI workloads on client devices, each model must be assessed separately to determine whether running it locally makes sense. AI computation may take on a hybrid form in the future, with many models running on client devices and additional cloud computing used for more complicated tasks. To aid with this, Intel's AI PC development team collaborates with the Office team to determine which use cases are most appropriate for customers using the Intel Core Ultra processor.
The foundation of Intel and Microsoft's continued cooperation is a common goal of an AI experience optimized to span cloud and on-device with products such as the AI PC. Future Intel processor generations will increase the client compute available for AI workloads. As a result, users can anticipate that essential tools like Microsoft Office will continue to be built to provide an excellent experience by utilizing the best client and cloud technologies.
Read more on govindhtech.com
0 notes
negativemind · 2 years
Text
Netron: A Tool for Visualizing Machine Learning Models
An open-source machine learning model viewer that was introduced on MOONGIFT years ago. Netron: Netron is a viewer for neural network, deep learning, and machine learning models. Netron supports ONNX, TensorFlow Lite, Caffe, Keras, Darknet, PaddlePaddle, ncnn, MNN, Core ML, RKNN, MXNet, MindSpore Lite, TNN, Barracuda, Tengine, CNTK, TensorFlow.js, Caffe2, and UFF. Netron has experimental support for PyTorch, TensorFlow, TorchScript, OpenVINO, Torch, Vitis AI, kmodel, Arm NN, BigDL, Chainer,…
View On WordPress
0 notes
lambdadeltacommie · 4 days
Text
My complaints about Python right now come from me needing to acquire an ONNX model for my pure Rust yugioh card generator, and everyone who does machine learning only carries around models in some inane Python-specific format
4 notes · View notes
thriftrescue · 10 days
Text
ONE PIECE shoes at thrift! "ONNX FASHION XIUXIAN" "OFF XIUXIANFASHION" 3D2Y
3 notes · View notes
hackernewsrobot · 1 month
Text
ONNX: The Open Standard for Seamless Machine Learning Interoperability
https://github.com/onnx/onnx
2 notes · View notes
Building A Responsive Game AI - Part 4
The part where the deadline is in less than a fortnight and the parts don't all fit together as planned.
The unfortunate nature of working with relatively new software ideas is that many existing systems are incompatible. "How can this be?" you might ask. "Surely you can just refactor existing code to make it compatible? Coding is one of the few disciplines where you can mould tools to your will, right?" This is true - you can do so.
It does, however, take time and energy, and it often requires learning that software's complexities to a level where you spend half as much time re-working existing software as it takes to make the new software. Using AI in game engines, for example, has been a challenge. Unity Engine does have an existing package to handle AI systems called "Barracuda". On paper, it's a great system that allows ONNX-based AI models to run natively within a Unity game environment. You can convert AI models trained in the main AI software libraries into ONNX and use them in Unity. The catch is that it doesn't have full compatibility with all AI software library functions. This is a problem with complex transformer-based AI models specifically - a.k.a. this project. Unity does have an upcoming closed-beta package which will resolve this (Sentis), but for now this project will effectively have to use a limited system of local network sockets to interface the main game with a concurrently-run Python script. Luckily I'd already made this networking system, since Barracuda doesn't allow model training within Unity itself and I needed a way to export training data and re-import re-trained models back into the engine.
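For anyone curious what the Python side of such a socket bridge can look like, here is a rough sketch - the model path, port, and message format are all hypothetical, and a real implementation would also need proper message framing.

```python
import json
import socket

import numpy as np
import onnxruntime as ort

# Load the trained policy model exported to ONNX (path is a placeholder).
session = ort.InferenceSession("policy_model.onnx")
input_name = session.get_inputs()[0].name

# Listen on a local socket that the game process connects to.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 9000))
server.listen(1)
conn, _ = server.accept()

while True:
    data = conn.recv(65536)          # assumes one JSON observation per recv
    if not data:
        break
    obs = np.asarray(json.loads(data)["observation"], dtype=np.float32)[None, :]
    action = session.run(None, {input_name: obs})[0]
    conn.sendall((json.dumps({"action": action.tolist()}) + "\n").encode())

conn.close()
server.close()
```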
People don't often realise how cobbled together software systems can be. It's the creativity, and lateral thinking, that make these kinds of projects interesting and challenging.
3 notes · View notes
willwade · 2 months
Text
Sometimes you have to do the hard graft as nobody else wants to. huggingface.co/willwade/… - all ONNX models of the Meta MMS Text to Speech models (code: github.com/willwade/… - and all suitable for sherpa-onnx)
0 notes
forlinx · 2 months
Text
Four Advantages Detailed Analysis of Forlinx Embedded FET3576-C System on Module
In order to fully meet the growing demand in the AIoT market for high-performance, high-computing-power, and low-power main controllers, Forlinx Embedded has recently launched the FET3576-C System on Module, designed based on the Rockchip RK3576 processor. It features excellent image and video processing capabilities, a rich array of interfaces and expansion options, low power consumption, and a wide range of application scenarios. This article delves into the distinctive benefits of the Forlinx Embedded FET3576-C SoM from four key aspects.
Advantage: 6 TOPS NPU, enabling AI applications
The Forlinx Embedded FET3576-C SoM has a built-in 6 TOPS NPU with excellent deep learning processing capability. It supports INT4/INT8/INT16/FP16/BF16/TF32 operations, and its dual cores can work together or independently, so computational resources can be flexibly allocated to complex deep learning tasks as needed. It also maintains high efficiency and stability when handling multiple deep learning tasks.
The FET3576-C SoM also supports TensorFlow, Caffe, TFLite, PyTorch, ONNX, Android NN, and other deep learning frameworks. Developers can easily deploy existing deep learning models to the SoM and carry out rapid development and optimization. This broad compatibility not only lowers the development threshold but also accelerates the promotion and adoption of deep learning applications.
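As a rough illustration of the usual first step in that workflow, here is how a trained PyTorch model can be exported to ONNX before being converted for the board. The model, shapes, and file names are placeholders, and the Rockchip NPU additionally requires the vendor's RKNN conversion step, which is not shown.

```python
import torch
import torchvision

# Stand-in for whatever trained model you want to deploy.
model = torchvision.models.resnet50(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy_input,
    "resnet50.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```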
Advantage: RK Firewall achieves true hardware resource isolation
The FET3576-C SoM with RK3576 processor supports RK Firewall technology, ensuring hardware resource isolation for access management between host devices, peripherals, and memory areas.
Access Control Policy - RK Firewall allows configuring policies to control which devices or system components access hardware resources. It includes IP address filtering, port control, and specific application access permissions. Combined with the AMP system, it efficiently manages access policies for diverse systems.
Hardware Resource Mapping and Monitoring - RK Firewall maps the hardware resources in the system, including memory areas, I/O devices, and network interfaces. By monitoring access to these resources, RK Firewall can track in real-time which devices or components are attempting to access specific resources.
Access Control Decision - When a device or component attempts to access hardware resources, RK Firewall will evaluate the access against predefined access control policies. If the access request complies with the policy requirements, access will be granted; otherwise, it will be denied.
Isolation Enforcement - For hardware resources identified as requiring isolation, RK Firewall will implement isolation measures to ensure that they can only be accessed by authorized devices or components.
In summary, RK Firewall achieves effective isolation and management of hardware resources by setting access control policies, monitoring hardware resource access, performing permission checks, and implementing isolation measures. These measures not only enhance system security but also ensure system stability and reliability.
Advantage: Ultra-clear display + AI intelligent repair
With its powerful multimedia processing capability, FET3576-C SoM provides users with excellent visual experience. It supports H.264/H.265 codecs for smooth HD video playback in various scenarios, while offering five display interfaces (HDMI/eDP, MIPI DSI, Parallel, EBC, DP) to ensure compatibility with diverse devices.
FET3576-C SoM notably supports triple-screen display functionality, enabling simultaneous display of different content on three screens, significantly enhancing multitasking efficiency.
In addition, its 4K @ 120Hz ultra-clear display and super-resolution function not only bring excellent picture quality, but also intelligently repair blurred images and improve video frame rates, giving users a clearer and smoother visual experience.
Advantage: FlexBus new parallel bus interface
The Forlinx Embedded FET3576-C offers a wide range of connectivity and transmission options thanks to its excellent interface design and flexible parallel bus technology. The FlexBus interface on the SoM is particularly noteworthy due to its high flexibility and scalability, allowing it to emulate irregular or standard protocols to accommodate a variety of complex communication needs.
FlexBus supports parallel transmission of 2/4/8/16 bits of data, enabling a significant increase in the data transfer rate, while the clock frequency of up to 100 MHz further ensures the high efficiency and stability of data transmission.
In addition to the FlexBus interface, the FET3576-C SoM integrates a variety of bus transfer interfaces, including DSMC, CAN-FD, PCIe 2.1, SATA 3.0, USB 3.2, SAI, I2C, I3C, and UART. These interfaces not only enrich the SoM's application scenarios but also enhance its compatibility with other devices and systems.
It is easy to see that with the advantages of a high-computing-power NPU, RK Firewall, powerful multimedia processing capability, and the FlexBus interface, the Forlinx Embedded FET3576-C SoM will be a strong player in the field of embedded hardware. Whether you are developing edge AI applications or pursuing high-performance, high-quality hardware devices, the Forlinx Embedded FET3576-C SoM is a choice not to be missed.
Originally published at www.forlinx.net.
0 notes
audreyshura · 2 months
Text
Microsoft Phi-3 Small Language Model, Big Impact
A New Era in Language Models
The training of large language models has posed significant challenges. Researchers have been striving to create more efficient, cost-effective, and offline-capable language models. After considerable effort, a promising solution has emerged.
Introducing the Microsoft Phi-3 mini, a revolutionary language model trained on 3.3 trillion tokens. This compact powerhouse, developed by the Microsoft research team, is not only lightweight and cost-effective but also highly functional.
The Evolution from Large to Small Language Models
Training large AI models requires vast amounts of data and computing resources. For instance, training the GPT-4 language model is estimated to have cost $21 million over three months. While GPT-4 is powerful enough to perform complex reasoning tasks, it is often overkill for simpler applications, like generating sales content or serving as a sales chatbot.
Microsoft's Approach with Phi-3 Mini
The Microsoft Phi-3 family of open models introduces the most capable small language model (SLM) yet. With 3.8 billion parameters and training on 3.3 trillion tokens, the Phi-3 mini is more powerful than many larger language models.
Microsoft claims that the Phi-3 mini offers an optimal, cost-efficient solution for a wide range of functions. It excels in tasks such as document summarization, knowledge retrieval, and content generation for social media. Moreover, the Phi-3 mini's standard API is available for developers to integrate into their applications, broadening its potential uses.
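As a small, hedged sketch of what local use can look like, the snippet below runs the publicly released microsoft/Phi-3-mini-4k-instruct checkpoint through the Hugging Face transformers library. This is an illustrative local setup, not the Azure API mentioned above, and the prompt is just an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", trust_remote_code=True
)

# Build a chat-formatted prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize: ONNX Runtime lets one model run on many devices."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```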
Performance Compared to Larger Models
Microsoft's Phi-3 small language models outperform other models of similar or larger sizes in key tests. In Retrieval-Augmented Generation tests, the Phi-3 mini outperformed even models twice its size.
Additionally, Microsoft plans to release more models in the Phi-3 family, including the Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters), both of which surpass larger models like GPT-3.5T. These models will be available on Microsoft Azure AI Model Catalog, Hugging Face, and Ollama.
Safety and Responsible AI
The Phi-3 models are developed with a strong focus on responsible AI practices. They adhere to Microsoft's Responsible AI Standard, which emphasizes accountability, transparency, fairness, reliability, safety, privacy, security, and inclusiveness.
Microsoft has implemented stringent safety measures for Phi-3 models, including comprehensive evaluations, red-teaming to identify risks, and adhering to security guidelines. These steps ensure that Phi models are developed, tested, and deployed responsibly.
Opening New Horizons of Capability
The Phi-3 AI models offer unique features and capabilities, making them applicable in various areas:
Resource-Constrained Environments: Suitable for environments with limited computational resources, including on-device and offline scenarios.
Latency-Sensitive Applications: Ideal for real-time processing or interactive systems due to their lower latency.
Cost-Conscious Use Cases: Provide a cost-effective solution for tasks with simpler requirements.
Compute-Limited Inference: Designed for compute-limited environments, optimized for cross-platform availability using ONNX Runtime.
Customization and Fine-Tuning: Easier to fine-tune or customize for specific applications, enhancing adaptability.
Analytical Tasks: Strong reasoning and logic capabilities make it suitable for processing large text content.
Agriculture and Rural Areas: Valuable in sectors like agriculture, where internet access may be limited, improving efficiency and accessibility.
Collaborative Solutions: Organizations like ITC leverage Phi-3 models in collaborative projects to enhance efficiency and accuracy.
Discovering the Phi-3 Small Language Model
Explore the potential of this advanced technology by visiting the Azure AI Playground. The Phi-3 AI model is also available on the Hugging Chat playground. Harness the power of this efficient AI model on Azure AI Studio.
0 notes
sergey-tihon · 3 months
Text
F# Weekly #24, 2024 - Adding #help to fsi
Welcome to F# Weekly, a roundup of F# content from this past week:
News
Adding #help to fsi
Using Phi-3 & C# with ONNX for text and vision samples – .NET Blog (microsoft.com)
Refactor your code with default lambda parameters – .NET Blog (microsoft.com)
Announcing Third Party API and Package Map Support for .NET Upgrade Assistant – .NET Blog (microsoft.com)
Privacy and security improvements…
View On WordPress
0 notes
elbrunoc · 4 months
Text
Powering Up #NET Apps with #Phi-3 and #SemanticKernel
Hi!
Introducing the Phi-3 Small Language Model
Hi! Phi-3 is an amazing Small Language Model. And hey, it's also an easy one to use in C#. I already wrote how to use it with ollama, now it's time to hit the ONNX version.
Introduction to Phi-3 Small Language Model
The Phi-3 Small Language Model (SLM) represents a significant leap forward in the field of artificial intelligence. Developed by…
View On WordPress
0 notes
govindhtech · 3 months
Text
Intel Neural Compressor Joins ONNX in Open Source for AI
Intel Neural Compressor
Intel Neural Compressor aims to provide popular model compression techniques such as quantization, distillation, pruning (sparsity), and neural architecture search on popular frameworks like TensorFlow, PyTorch, ONNX Runtime, and MXNet, as well as Intel extensions such as Intel Extension for PyTorch and Intel Extension for TensorFlow. Specifically, the tool offers the following main functions, common examples, and open collaborations:
Support for a wide range of Intel hardware, including Intel Xeon Scalable Processors, Intel Xeon CPU Max Series, Intel Data Center GPU Flex Series, and Intel Data Center GPU Max Series, is extensively tested; AMD, Arm, and NVIDIA GPUs receive limited testing via ONNX Runtime.
Validate well-known LLMs such as LLaMA 2, Falcon, GPT-J, Bloom, and OPT, as well as more than 10,000 broad models such as ResNet50, BERT-Large, and Stable Diffusion from popular model hubs like Hugging Face, Torch Vision, and ONNX Model Zoo, by leveraging zero-code optimization solutions and automatic accuracy-driven quantization techniques.
Work together with open AI ecosystems like Hugging Face, PyTorch, ONNX, ONNX Runtime, and Lightning AI; cloud marketplaces like Google Cloud Platform, Amazon Web Services, and Azure; software platforms like Alibaba Cloud, Tencent TACO, and Microsoft Olive.
AI models
AI-enhanced apps will be the standard in the era of the AI PC, and developers are gradually substituting AI models for conventional code fragments. This rapidly developing trend is opening up new and fascinating user experiences, improving productivity, giving creators new tools, and facilitating fluid and organic collaboration experiences.
With the combination of CPU, GPU (Graphics Processing Unit), and NPU (Neural Processing Unit), AI PCs offer the fundamental compute blocks to enable these varied AI experiences and meet the computing needs of these models. But to give users the best possible experience with AI PCs and all of these compute engines, developers must compress these AI models, which is a difficult task. To address this, Intel is pleased to announce that it has embraced the open-source community and released the Neural Compressor tool under the ONNX project.
ONNX
An open ecosystem called Open Neural Network Exchange (ONNX) gives AI developers the freedom to select the appropriate tools as their projects advance. ONNX offers an open source format for AI models, both deep learning and conventional ML. It provides definitions for standard data types, built-in operators, and an extensible computation graph model. At the moment, the project concentrates on the capabilities required for inferencing, or scoring.
Widely supported, ONNX is present in a variety of hardware, tools, and frameworks. Facilitating communication between disparate frameworks and optimizing the path from experimentation to production helps increase the AI community's rate of innovation. Intel invites the community to work with it to advance ONNX.
How Does Neural Compressor Work?
Building on Intel Neural Compressor, Neural Compressor seeks to offer widely used model compression approaches. It is a straightforward yet intelligent tool designed to optimize neural network models described in the Open Neural Network Exchange (ONNX) standard. ONNX models, the industry-leading open standard for AI model representation, enable smooth interchange across many platforms and frameworks. Now, Intel elevates ONNX to a new level with Neural Compressor.
Neural Compressor
With a focus on ONNX model quantization, Neural Compressor seeks to offer well-liked model compression approaches including SmoothQuant and weight-only quantization via ONNX Runtime, which it inherits from Intel Neural Compressor. Specifically, the tool offers the following main functions, common examples, and open collaborations:
Support a large variety of Intel hardware, including AIPC and Intel Xeon Scalable Processors.
Validate well-known LLMs like LLaMA 2 and broad models like BERT-base and ResNet50 from well-known model hubs such as Hugging Face and ONNX Model Zoo, using automatic accuracy-driven quantization techniques.
Work together with open AI ecosystems Hugging Face, ONNX, and ONNX Runtime, as well as software platforms like Microsoft Olive.
Why Is It Important?
Efficiency grows increasingly important as AI begins to seep into people’s daily lives. Making the most of your hardware resources is essential whether you’re developing computer vision apps, natural language processors, or recommendation engines. How does the Neural Compressor accomplish this?
Minimising Model Footprint
Smaller models translate into quicker deployment, lower memory usage, and faster inference times. These qualities are essential for maintaining performance when executing your AI-powered application on the AI PC. Smaller models result in lower latency, greater throughput, and less data transfer all of which save money in server and cloud environments.
Quicker Inference
The Neural Compressor quantizes parameters, eliminates superfluous connections, and optimizes model weights. With AI acceleration features like those built into Intel Core Ultra CPUs (Intel DL Boost), GPUs (Intel XMX), and NPUs (Intel AI Boost), this leads to lightning-fast inference.
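As a rough sketch of the kind of flow Neural Compressor automates, the snippet below follows the Intel Neural Compressor 2.x-style Python API. Treat the exact names as assumptions, since the API has shifted across releases and in the newer ONNX-focused package; the file names are placeholders.

```python
from neural_compressor import PostTrainingQuantConfig, quantization

# Post-training dynamic quantization of an ONNX model; supplying an eval
# function or dataloader instead enables the accuracy-driven tuning loop.
conf = PostTrainingQuantConfig(approach="dynamic")
q_model = quantization.fit(model="model_fp32.onnx", conf=conf)
q_model.save("model_int8.onnx")
```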
AI PC Developer Benefits
Quicker Prototyping
Model compression and quantization are challenging! Through developer-friendly APIs, Neural Compressor enables developers to swiftly iterate on model architectures and effortlessly use cutting-edge quantization approaches such as 4-bit weight-only quantization and SmoothQuant.
Better User Experience
Your AI-driven apps will react quickly and please consumers with smooth interactions.
Simple deployment using models that comply with ONNX, providing native Windows API support for deployment on CPU, GPU, and NPU right out of the box.
What Comes Next?
Intel Neural Compressor Github
Intel looks forward to working with the developer community as part of the ONNX initiative and enhancing synergies in the ONNX ecosystem.
Read more on Govindhtech.com
1 note · View note
ai-news · 4 months
Link
ONNX is an open-source machine learning (ML) standard that provides interoperability across a wide range of frameworks, operating systems, and hardware platforms. ONNX Runtime is the runtime engine used for model inference and training with ONNX. #AI #ML #Automation
0 notes
dreamstate-portfolio · 5 months
Text
Kim2091 chaiNNer TTA Templates: A Rundown on Test-Time Adaptation and Averaged Outputs
SKIP TO THE MAIN EXPLANATION BY SCROLLING TO THE CHAIN PICTURES
TTA stands for Test-Time Adaptation, it is a method of putting an image through various transformations to produce more accurate output results (an averaged output). When you’re upscaling an image it’s making approximations to fill in the information it doesn’t have. There are many reasons why it has trouble with that information, lighting, artifacts and obscured objects being some primary ones.
Using the TTA method, it applies transformations like rotation, missing parts, exposure, and transparency to the image and tests the results against each other before combining them into a final result that is (potentially) more accurate to the original. Many tests show that TTA methods can increase accuracy by up to 15%. It does, however, depend on what you are upscaling and on your model as well - don't expect TTA to give you 15% better accuracy 100% of the time.
Within upscaling it became more popularized by being a setting in Waifu2x.
However, it has been around since before that and is seeing wider use as more industries add machine learning to their processes. There is no one singular method of TTA, but the 'standard' for upscaling - as seen in this chain preset and in the previously mentioned Waifu2x - is 8 rotations of the base image. Kim has also made lightweight TTA presets that use fewer, which I'll be showing as well.
Note that if you have a slower or older computer, TTA will be far more taxing on your system than a simpler upscale process. Batch processing using TTA will be noticeably longer, no matter what kind of computer you have.
TTA is not something you need to use; if you're happy with your upscaling results, then it would be pointless to waste resources on it. It will typically be used when accuracy matters or when you are having issues getting the results you want the normal ways. For example, many of the gifs I will be posting to this account soon will feature TTA, since line and object accuracy matters more heavily when viewed in motion vs a still image. It would also be good for images with heavy artifacts.
When I first heard 'TTA' I had no idea what it meant, and couldn't find much at first beyond 'it rotates the image, and sometimes your image might look better because of it'; even finding comparison images was difficult. This led me down a rabbit hole on TTA. I will not burden any of you with a list of sources and links because you would be scrolling for a while. Instead I have it listed here https://trello.com/c/COHYy9u9 . It is quite an extensive list; I have tried to take it and summarize it for you (as I understand it) the best I can.
------------------------------------------------------------------------------
Kim2091 made these templates, I am just explaining their use and showing examples. The example chains are pretty simple, but I understand how they can seem overwhelming which is why I have made this. This is beginner to intermediate level, you should have foundational knowledge on chaiNNer and upscaling.
This is the second part of this series. The first is here Kim2091 chaiNNer Templates I also made a starter guide for chaiNNer you can check out chaiNNer (AI) Upscale Basics
All credit for these chains goes to Kim2091; these chains are linked on the chaiNNer GitHub page as well. They're a great resource to use.
------------------------------------------------------------------------------
Some previews have the following edits in PS: find edges filter, b+w adjustment (max blk), and exposure adjustment with gamma turned down (0.25).
I have also broken down Kim’s preset a bit to better explain how everything is connecting together.
------------------------------------------------------------------------------
Similarly to the section in Pt 1 on comparison chains, I will be covering all three TTA chains in one go because they’re so similar it would be pointless to make three separate explanations. These chains are very plug and play, despite the scary number of nodes, you don’t actually have to mess with them at all if you don’t want to outside of Load Image/Model and Save Image which should be familiar already.
LITE
LITE DUAL (lite TTA chain using two models)
FULL
------------------------------------------------------------------------------
Now for the basics of how the TTA portion of the chain works: it takes your base image and splits it for the various transformations (while keeping one un-transformed copy as the base layer that the transformed results will be averaged with), applies the transformations, upscales each transformed copy (the transformation is reverted further along the chain while keeping the information from the transformed upscale, so it can be compared against the upscaled base layer and the other transformed upscales), merges/averages the results together, and then outputs the final merged result.
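If it helps to see the idea outside of chaiNNer, here is a minimal NumPy sketch of that transform-upscale-revert-average loop. The upscale function is a placeholder for whatever model or chain you actually run, and the 8 transforms are the 4 rotations combined with a horizontal flip (often loosely called '8 rotations').

```python
import numpy as np

def upscale(img: np.ndarray) -> np.ndarray:
    # Placeholder: nearest-neighbour 2x "upscale" so the sketch runs on its own.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def tta_upscale(img: np.ndarray) -> np.ndarray:
    outputs = []
    for flip in (False, True):
        base = img[:, ::-1] if flip else img      # optional horizontal flip
        for k in range(4):                        # the 4 rotations
            out = upscale(np.rot90(base, k))
            out = np.rot90(out, -k)               # revert the rotation
            if flip:
                out = out[:, ::-1]                # revert the flip
            outputs.append(out)
    return np.mean(outputs, axis=0)               # average the 8 results

result = tta_upscale(np.random.rand(32, 32, 3))
print(result.shape)  # (64, 64, 3)
```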
These are the only nodes you actually need to change anything on; you'll notice it's just a basic upscale chain without the TTA additions.
The image is split and connected to the beginning transformation nodes of the TTA chain
The ‘base layer’ that is unaffected by transformations
The main transformations that are reverted further along the chain before the results are averaged together.
The results being merged together
The final output
I'm sorry this was so long compared to the simple chains post; I hope it has at least been informative. I wanted to break down TTA enough for anyone who wants to experiment with making their own TTA chains, and provide enough background information that what I'm showing can be easily understood and applied to your own upscaling work.
------------------------------------------------------------------------------
Gifs are out of sync I'll fix them soon I swear.
Still image previews will be uploaded soon and linked here.
0 notes
ardhra2000 · 5 months
Text
7 Tips and Tricks for Optimizing Deep Learning Models in CNTK
The foundation of CNTK's capabilities lies in its support for distributed deep learning. It is adept at handling and processing massive datasets across numerous machines. This considerably reduces computation time, making the training of complex models more practical, leaving no space for the "cntk vs tensorflow" debate.
CNTK's ties to the ONNX format, an open format for AI and machine learning models, are integral. This integration enables the deployment of CNTK models across various systems. The ONNX format further simplifies compatibility, allowing models to be exchanged with other deep learning frameworks like TensorFlow, making the "cntk tensorflow" interaction smoother.
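As a tiny sketch of that handoff (assuming the CNTK Python API and placeholder file names), an existing CNTK model can be written out in ONNX format for other runtimes to consume:

```python
import cntk as C

# Load an existing trained CNTK model and re-save it in ONNX format.
model = C.Function.load("model.cntkmodel")
model.save("model.onnx", format=C.ModelFormat.ONNX)
```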
Batch size is a crucial factor in the training phase of deep learning models. It impacts both learning accuracy and computational speed. In CNTK, developers have the flexibility to tune the batch size to match their hardware capabilities, often leading to significant improvements in model performance. So, when someone discusses "cntk vs tensorflow," the flexible batch size is a significant feature to consider.
Parallelization allows CNTK to handle larger models and datasets than could be run on a single machine. It also makes the process of training deep learning models faster and more efficient. Whether you're comparing CNTK vs TensorFlow or any other framework, appreciating the benefits of parallelism in CNTK machine learning solutions is crucial.
Deep learning models are more than just algorithms; they're intricate structures that learn from data. Good model representation simplifies things, making it easier to understand and work with complex models.
0 notes
jcmarchi · 5 months
Text
Microsoft Unveils Phi-3: Powerful Open AI Models Delivering Top Performance at Small Sizes
New Post has been published on https://thedigitalinsider.com/microsoft-unveils-phi-3-powerful-open-ai-models-delivering-top-performance-at-small-sizes/
Microsoft has introduced Phi-3, a new family of small language models (SLMs) that aim to deliver high performance and cost-effectiveness in AI applications. These models have shown strong results across benchmarks in language comprehension, reasoning, coding, and mathematics when compared to models of similar and larger sizes. The release of Phi-3 expands the options available to developers and businesses looking to leverage AI while balancing efficiency and cost.
Phi-3 Model Family and Availability
The first model in the Phi-3 lineup is Phi-3-mini, a 3.8B parameter model now available on Azure AI Studio, Hugging Face, and Ollama. Phi-3-mini comes instruction-tuned, allowing it to be used “out-of-the-box” without extensive fine-tuning. It features a context window of up to 128K tokens, the longest in its size class, enabling processing of larger text inputs without sacrificing performance.
To optimize performance across hardware setups, Phi-3-mini has been fine-tuned for ONNX Runtime and NVIDIA GPUs. Microsoft plans to expand the Phi-3 family soon with the release of Phi-3-small (7B parameters) and Phi-3-medium (14B parameters). These additional models will provide a wider range of options to meet diverse needs and budgets.
Image: Microsoft
Phi-3 Performance and Development
Microsoft reports that the Phi-3 models have demonstrated significant performance improvements over models of the same size and even larger models across various benchmarks. According to the company, Phi-3-mini has outperformed models twice its size in language understanding and generation tasks, while Phi-3-small and Phi-3-medium have surpassed much larger models, such as GPT-3.5T, in certain evaluations.
Microsoft states that the development of the Phi-3 models has followed the company’s Responsible AI principles and standards, which emphasize accountability, transparency, fairness, reliability, safety, privacy, security, and inclusiveness. The models have reportedly undergone safety training, evaluations, and red-teaming to ensure adherence to responsible AI deployment practices.
Image: Microsoft
Potential Applications and Capabilities of Phi-3
The Phi-3 family is designed to excel in scenarios where resources are constrained, low latency is essential, or cost-effectiveness is a priority. These models have the potential to enable on-device inference, allowing AI-powered applications to run efficiently on a wide range of devices, including those with limited computing power. The smaller size of Phi-3 models may also make fine-tuning and customization more affordable for businesses, enabling them to adapt the models to their specific use cases without incurring high costs.
In applications where fast response times are critical, Phi-3 models offer a promising solution. Their optimized architecture and efficient processing can enable quick generation of results, enhancing user experiences and opening up possibilities for real-time AI interactions. Additionally, Phi-3-mini’s strong reasoning and logic capabilities make it well-suited for analytical tasks, such as data analysis and insights generation.
As real-world applications of Phi-3 models emerge, the potential for these models to drive innovation and make AI more accessible becomes increasingly clear. The Phi-3 family represents a milestone in the democratization of AI, empowering businesses and developers to harness the power of advanced language models while maintaining efficiency and cost-effectiveness.
With the release of Phi-3, Microsoft pushes the boundaries of what is possible with small language models, paving the way for a future where AI can be seamlessly integrated into a wide range of applications and devices.
1 note · View note