Model Evaluation and Selection

Posted on 2018-11-13 Edited on 2022-10-14 In machine learning, Model Evaluation and Selection

Basic concepts of this chapter

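Error rate: if $a$ of the $m$ samples are misclassified, the error rate is $E = a/m$; correspondingly, $1 - a/m$ is called the accuracy.
The difference between the learner's actual predicted output and the true output of a sample is called the error. The error on the training set is the training error, also called the empirical error; the error on new samples is the generalization error.
Overfitting: the learner fits the training samples so closely that it generalizes poorly to new samples. Underfitting: the learner has not yet learned the general properties of the training samples well.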
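
The definitions of error rate and accuracy above can be sketched in a few lines of Python. This is a minimal illustration only; the function names and the sample labels are made up for the example.

```python
def error_rate(y_true, y_pred):
    """Return E = a / m, where a is the number of misclassified samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    m = len(y_true)
    # a counts the positions where the prediction differs from the true label.
    a = sum(1 for truth, pred in zip(y_true, y_pred) if truth != pred)
    return a / m


def accuracy(y_true, y_pred):
    """Return 1 - a / m."""
    return 1 - error_rate(y_true, y_pred)


# Hypothetical labels: 2 of the 5 predictions are wrong.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
print(error_rate(y_true, y_pred), accuracy(y_true, y_pred))
```

Calling the same functions on training data gives the training (empirical) error, and calling them on held-out data gives an estimate of the generalization error.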

Basic Conceptions

Posted on 2018-11-13 Edited on 2022-10-14 In machine learning, basic conception

Basic conception

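Feature vector: each component corresponds to the value of one attribute, also called a feature; the space spanned by the attributes is the attribute space.
In general, let $D = \{x_1, x_2, \ldots, x_m\}$ denote a data set of $m$ instances, each described by $d$ attributes. Each instance $x_i = (x_{i1}; x_{i2}; \ldots; x_{id})$ is then a vector in the $d$-dimensional sample space $\mathcal{X}$, with $x_i \in \mathcal{X}$, and $d$ is the dimensionality of $x_i$.
The outcome information of a training sample is called its label; an instance that carries label information is called an example, usually written $(x_i, y_i)$.
Learning tasks fall roughly into two classes: supervised learning and unsupervised learning. Classification and regression are representative of the former, clustering of the latter.
Generalization ability: the ability of the learned model to apply to new samples.
It is usually assumed that all samples in the sample space follow an unknown distribution, and that each sample obtained is independent and identically distributed (i.i.d.).
Induction: generalizing from the specific to the general. Deduction: specializing from the general to the specific.
Hypothesis space: the space of hypotheses formed by the possible values of each attribute of a sample $x$. An attribute may take the wildcard *, meaning any value is acceptable, and the space also includes the empty hypothesis $\emptyset$.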

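Version space: when the hypothesis space contains several hypotheses that are consistent with the training set, the set of those hypotheses is called the version space.
Inductive bias: the preference of a learning algorithm for certain types of hypotheses during learning.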
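
To make the hypothesis space and the version space concrete, here is a small sketch that enumerates both for a toy problem. The attribute names, their values, and the two training examples are hypothetical and chosen only for illustration; the empty hypothesis is represented by `None`.

```python
from itertools import product

WILDCARD = "*"

# Hypothetical attributes, two possible values each.
ATTRIBUTES = {
    "color": ["green", "black"],
    "root": ["curled", "straight"],
}


def hypothesis_space(attributes):
    """Every combination of attribute values or wildcards, plus the empty hypothesis."""
    choices = [values + [WILDCARD] for values in attributes.values()]
    hypotheses = [tuple(h) for h in product(*choices)]
    hypotheses.append(None)  # None stands for the empty hypothesis.
    return hypotheses


def predicts_positive(hypothesis, x):
    """A hypothesis covers x if every attribute matches or is a wildcard."""
    if hypothesis is None:
        return False  # The empty hypothesis covers nothing.
    return all(h == WILDCARD or h == v for h, v in zip(hypothesis, x))


def version_space(hypotheses, examples):
    """Keep the hypotheses that agree with every labeled example."""
    return [
        h for h in hypotheses
        if all(predicts_positive(h, x) == y for x, y in examples)
    ]


# Hypothetical training set: one positive and one negative example.
examples = [
    (("green", "curled"), True),
    (("black", "straight"), False),
]

space = hypothesis_space(ATTRIBUTES)
print(len(space))                      # 3 * 3 + 1 hypotheses
print(version_space(space, examples))
```

Several hypotheses remain consistent with this training set, so the data alone cannot pick one of them. Choosing among them is exactly where an inductive bias is needed.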

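Occam's razor: if several hypotheses are consistent with the observations, choose the simplest one.
No Free Lunch Theorem: all learning algorithms have the same expected performance, provided that every "problem" is equally likely to occur, or that all problems are equally important. In practice this premise does not hold.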
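
The premise of the No Free Lunch Theorem can be checked by brute force on a tiny binary classification problem. The sketch below assumes that every possible labeling of the unseen points is an equally likely target function, and it averages the error of a fixed set of predictions over all of them. The function name and the example predictions are illustrative only.

```python
from itertools import product


def average_off_training_error(predictions):
    """Average error on unseen points over all possible binary target functions.

    predictions: tuple of 0/1 labels that some learner assigns to the unseen points.
    """
    n = len(predictions)
    targets = list(product([0, 1], repeat=n))  # every possible true labeling
    mismatches = sum(
        sum(p != t for p, t in zip(predictions, target))
        for target in targets
    )
    return mismatches / (n * len(targets))


# Two very different "learners" on three unseen points.
print(average_off_training_error((0, 0, 0)))
print(average_off_training_error((1, 0, 1)))
```

Whatever the predictions are, each unseen point is labeled 1 in exactly half of the target functions, so the average comes out the same. Once some target functions are more likely than others, as in real problems, this symmetry breaks and algorithms can differ.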

Python numpy exercises

Posted on 2018-11-11 Edited on 2022-10-14 In language, python

The following content is reposted.

100 numpy exercises

CloudSim Structure

Posted on 2018-11-09 Edited on 2022-10-14 In Cloud Workflow Scheduling

Basic Concepts and Terminologies

Layered Design

Seeing (看见), by Chai Jing

Posted on 2018-11-07 Edited on 2022-10-14 In Life

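Introduction

Seeing (《看见》) is an autobiographical work in which Chai Jing recounts her ten years at CCTV. It is both a personal account of her growth and, to some extent, a memorandum of a decade of change in Chinese society.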

A Cost-Effective Deadline-Constrained Dynamic Scheduling Algorithm for Scientific Workflows in a Cloud Environment

Posted on 2018-11-06 Edited on 2022-10-14 In Cloud Workflow Scheduling

Benefits and Issues of Cloud

Benefits

Interesting Things

Posted on 2018-11-04 Edited on 2022-10-14 In Life

Basic principles of cryptography

Java basics

hexo configuration on multiple computers

Posted on 2018-11-04 Edited on 2022-10-14 In Entertainment

Todo for myself

Install node.js
Install hexo
npm install -g hexo

Download the latest version from my GitHub

git clone https://github.com/yantijin/yantijin.github.io.git

Download the latest theme configuration into another folder
git clone https://github.com/yantijin/Theme_Next.git
Then copy the theme into yantijin.github.io/themes
Install dependencies for your blog
npm install

Things that need to be modified

About MATHJAX

npm uninstall hexo-renderer-marked --save
npm install hexo-renderer-kramed --save

The next step is to modify the file node_modules/kramed/lib/rules/inline.js.

Change line 10 and line 20:

// escape: /^\\([\\`*{}\[\]()#$+\-.!_>])/,
escape: /^\\([`*\[\]()#$+\-.!_>])/,
// em: /^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/,
 em: /^\*((?:\*\*|[\s\S])+?)\*(?!\*)/,
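
To see what the two changes do, the patterns can be tried out in Python, whose `re` syntax is close enough to JavaScript for these four expressions. This is only a sketch for inspecting the patterns; it does not model how kramed's lexer walks through a document, and the helper `consumed` and the sample inputs are made up for the illustration.

```python
import re

# The four patterns from the post, written as Python raw strings.
ESCAPE_ORIGINAL = re.compile(r"^\\([\\`*{}\[\]()#$+\-.!_>])")
ESCAPE_MODIFIED = re.compile(r"^\\([`*\[\]()#$+\-.!_>])")
EM_ORIGINAL = re.compile(r"^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)")
EM_MODIFIED = re.compile(r"^\*((?:\*\*|[\s\S])+?)\*(?!\*)")


def consumed(pattern, text):
    """Return the prefix of text that the rule would consume, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


# A backslash before a brace, as in the LaTeX fragment \{x\}.
print(consumed(ESCAPE_ORIGINAL, r"\{x\}"))   # treated as a Markdown escape
print(consumed(ESCAPE_MODIFIED, r"\{x\}"))   # left alone

# Input that starts with an underscore-delimited span.
print(consumed(EM_ORIGINAL, "_i + y_ z"))    # treated as emphasis
print(consumed(EM_MODIFIED, "_i + y_ z"))    # left alone

# Asterisk emphasis still works after the change.
print(consumed(EM_MODIFIED, "*emph* text"))
```

The modified rules stop treating `\\`, `\{`, `\}` and underscore pairs as Markdown syntax, so those characters reach MathJax unchanged, while emphasis written with asterisks is unaffected.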

See the reference for details

Process Mining based on the Markov Transition Matrix

Posted on 2018-11-03 Edited on 2022-10-14 In Process Mining, Markov Transition Matrix

Main Thoughts

First, analyse the transition probabilities between activities in the workflow log to build the Markov transition matrix;
Then, mine the logical relationships of the process by defining a set of logical-relation rules;
Finally, design the process mining algorithm that establishes the actual structural relationships between the activities, in order to reconstruct the workflow.
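
The first step above can be sketched as follows: count how often one activity is directly followed by another in the log, then normalize each row into probabilities. The log format (one list of activity names per case) and the sample traces are assumptions made for this illustration. The logical-relation rules and the reconstruction algorithm are not described in this excerpt, so they are not shown.

```python
from collections import Counter, defaultdict


def transition_matrix(traces):
    """Estimate P(next activity | current activity) from workflow traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Pair each activity with the one that directly follows it.
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1

    matrix = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        matrix[current] = {
            following: count / total for following, count in followers.items()
        }
    return matrix


# Hypothetical log with three cases over the activities A, B, C, D.
log = [
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["A", "B", "D"],
]

for activity, row in transition_matrix(log).items():
    print(activity, row)
```

In this toy log, A is followed by B in two of three cases and by C in one, which hints at a choice between B and C after A. Rules of that kind are what the second step would formalize.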

FirstTest

Posted on 2018-11-02 Edited on 2022-10-14 In Entertainment

This is a test on hexo plus github

Head 2